Abstract. We establish a new mixing theorem for quasirandom groups (finite
groups with no low-dimensional unitary representations) G which, informally
speaking, asserts that if g, x are drawn uniformly at random from G, then
the quadruple (g, x, gx, xg) behaves like a random tuple in $G^4$, subject to the
obvious constraint that gx and xg are conjugate to each other. The proof
is non-elementary, proceeding by first using an ultraproduct construction to
replace the finitary claim on quasirandom groups with an infinitary analogue
concerning a limiting group object that we call an ultra quasirandom group,
and then using the machinery of idempotent ultrafilters to establish the
required mixing property for such groups. Some simpler recurrence theorems
(involving tuples such as (x, gx, xg)) are also presented, as well as some
further discussion of specific examples of ultra quasirandom groups.



1. Introduction 

In [19], Gowers introduced the notion of a quasirandom group:

Definition 1 (Quasirandom group). A finite group G is said to be D-quasirandom
for some parameter $D \geq 1$ if all non-trivial unitary^1 representations
$\rho : G \to U_d(\mathbb{C})$ of G have dimension d greater than or equal to D.

We informally call a finite group quasirandom if it is D-quasirandom for a
large value of D.

Example 2. The alternating group $A_n$ is $(n-1)$-quasirandom for all $n \geq 6$. More
generally, if G is perfect (i.e. $G = [G,G]$) and has no normal subgroup of index less
than m, then G is $\sqrt{\log m}/2$-quasirandom; see [19, Theorem 4.8], which also asserts
a converse implication (but with $\sqrt{\log m}/2$ replaced by $\sqrt{m}$). In particular, if G
is a non-abelian finite simple group, then G is $\sqrt{\log |G|}/2$-quasirandom, where $|G|$
denotes the cardinality of G; more generally, if every group in the Jordan-Hölder
decomposition of G is a non-abelian finite simple group of order at least m, then G
is $\sqrt{\log m}/2$-quasirandom. On the other hand, for many finite simple groups, one
can improve this logarithmic bound to a polynomial bound. For instance, the group
$SL_2(F_p)$ is $\frac{p-1}{2}$-quasirandom for any prime p; see Lemma 52. More generally, see
[24] for a precise computation of the quasirandomness of finite Chevalley groups.

One can combine these observations to obtain further examples of quasirandom
groups. For instance, if p, q are distinct primes, then by the Chinese remainder
theorem, $SL_2(\mathbb{Z}/pq\mathbb{Z})$ is isomorphic to the direct product of $SL_2(F_p)$ and $SL_2(F_q)$,
and so is $\frac{\min(p,q)-1}{2}$-quasirandom.
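As a small sanity check of the Chinese remainder isomorphism used above (not part of the original argument), the following Python sketch brute-forces the case p = 2, q = 3. The helper names `sl2`, `matmul`, `reduce_mod` and `crt` are hypothetical, chosen only for this illustration, and the primes are picked purely to keep the search tiny; the resulting quasirandomness bound is of course trivial at this size.

```python
from itertools import product

def sl2(n):
    """All 2x2 matrices (a, b, c, d) over Z/nZ with determinant 1 mod n."""
    return [m for m in product(range(n), repeat=4)
            if (m[0] * m[3] - m[1] * m[2]) % n == 1]

def matmul(m1, m2, n):
    """Product of two 2x2 matrices stored as (a, b, c, d), reduced mod n."""
    a, b, c, d = m1
    e, f, g, h = m2
    return ((a * e + b * g) % n, (a * f + b * h) % n,
            (c * e + d * g) % n, (c * f + d * h) % n)

def reduce_mod(m, n):
    """Reduce every entry of the matrix m modulo n."""
    return tuple(entry % n for entry in m)

p, q = 2, 3  # small distinct primes (an arbitrary choice for this sketch)
G_pq, G_p, G_q = sl2(p * q), sl2(p), sl2(q)

# The reduction map m -> (m mod p, m mod q).
crt = {m: (reduce_mod(m, p), reduce_mod(m, q)) for m in G_pq}

# Surjective onto the product and equal cardinalities, hence bijective.
is_bijection = (set(crt.values()) == set(product(G_p, G_q))
                and len(G_pq) == len(G_p) * len(G_q))

# The map respects matrix multiplication.
is_homomorphism = all(
    crt[matmul(m1, m2, p * q)]
    == (matmul(crt[m1][0], crt[m2][0], p), matmul(crt[m1][1], crt[m2][1], q))
    for m1 in G_pq for m2 in G_pq
)

print(len(G_pq), len(G_p), len(G_q), is_bijection, is_homomorphism)
```

The check is exhaustive for these two primes only; it illustrates the isomorphism but says nothing about larger p, q.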
^1 If desired, one could replace "unitary" here by "C-linear" (thus relaxing $U_d(\mathbb{C})$ to $GL_d(\mathbb{C})$),
because any linear action of a finite group preserves at least one Hermitian form by an averaging
argument.
When D is large, such groups become mixing in the sense that averages such as

$$\mathbb{E}_{x \in G} f_1(x) f_2(xg)$$

for "typical" values of g tend to stay very close to $(\mathbb{E}_G f_1)(\mathbb{E}_G f_2)$ for bounded
functions $f_1, f_2 : G \to \mathbb{R}$, where we use the averaging notation

$$\mathbb{E}_G f = \mathbb{E}_{x \in G} f(x) := \frac{1}{|G|} \sum_{x \in G} f(x).$$

More precisely, we have the following inequality, essentially present in [19] (see also
DP): 

Proposition 3 (Weak mixing). Let G be a D-quasirandom group for some $D \geq 1$,
and let $f_1, f_2 : G \to \mathbb{C}$ be functions. Then

$$\mathbb{E}_{g \in G} |\mathbb{E}_{x \in G} f_1(x) f_2(xg) - (\mathbb{E}_G f_1)(\mathbb{E}_G f_2)| \leq D^{-1/2} \|f_1\|_{L^2(G)} \|f_2\|_{L^2(G)}$$

where $\|f\|_{L^2(G)} := (\mathbb{E}_G |f|^2)^{1/2}$.

Proof. Observe that the left-hand side does not change if one subtracts a constant
from either $f_1$ or $f_2$, so we may reduce to the case when $f_2$ has mean zero. By
the Cauchy-Schwarz inequality, it thus suffices to show that the linear operator
$T = T_{f_2} : L^2(G) \to L^2(G)$ defined by

$$T_{f_2} f_1(g) := \mathbb{E}_{x \in G} f_1(x) f_2(xg)$$

has operator norm $\|T\|_{op}$ at most $D^{-1/2} \|f_2\|_{L^2(G)}$.

We may of course assume that $f_2$ is not identically zero. Let V be the space
of functions $f \in L^2(G)$ such that $\|T_{f_2} f\|_{L^2(G)} = \|T\|_{op} \|f\|_{L^2(G)}$, i.e. the right
singular space corresponding to the largest singular value $\|T\|_{op}$. This is a vector
space which is invariant under the action of right-translation $R_h f(x) := f(xh)$ by
elements $h \in G$, and also does not contain any non-trivial constant functions,
and hence by quasirandomness has dimension at least D. As T acts on V as
$\|T\|_{op}$ times an isometry, we conclude that the Hilbert-Schmidt norm

$$\|T\|_{HS} = (\mathbb{E}_{x,g \in G} |f_2(xg)|^2)^{1/2} = \|f_2\|_{L^2(G)}$$

is at least $D^{1/2} \|T\|_{op}$, and the desired bound $\|T\|_{op} \leq D^{-1/2} \|f_2\|_{L^2(G)}$ follows. □
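The inequality of Proposition 3 can be checked numerically on a small group. The sketch below (an illustration only, not part of the proof) uses the alternating group $A_5$ and assumes the standard fact that its non-trivial irreducible representations have dimensions 3, 3, 4, 5, so that it is 3-quasirandom. The function names `mixing_defect` and `l2_norm`, and the randomly chosen test functions, are hypothetical.

```python
import math
import random
from itertools import permutations

def compose(p, q):
    """Group law on permutations stored as tuples: (p*q)(i) = p[q[i]]."""
    return tuple(p[i] for i in q)

def is_even(p):
    """Parity of a permutation via its inversion count."""
    n = len(p)
    return sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n)) % 2 == 0

G = [p for p in permutations(range(5)) if is_even(p)]  # the group A_5
index = {p: i for i, p in enumerate(G)}
# mul[i][j] is the index of the product G[i] * G[j]
mul = [[index[compose(x, g)] for g in G] for x in G]

def mean(values):
    return sum(values) / len(values)

def l2_norm(f):
    """Normalised L^2 norm (E_G |f|^2)^(1/2) of a function given as a list."""
    return math.sqrt(mean([abs(v) ** 2 for v in f]))

def mixing_defect(f1, f2):
    """E_g | E_x f1(x) f2(xg) - (E f1)(E f2) | for functions given as lists."""
    n = len(G)
    target = mean(f1) * mean(f2)
    total = 0.0
    for j in range(n):
        corr = sum(f1[i] * f2[mul[i][j]] for i in range(n)) / n
        total += abs(corr - target)
    return total / n

rng = random.Random(0)  # fixed seed; the test functions are arbitrary
f1 = [rng.uniform(-1, 1) for _ in G]
f2 = [rng.uniform(-1, 1) for _ in G]
D = 3  # assumed quasirandomness parameter of A_5
lhs = mixing_defect(f1, f2)
rhs = D ** -0.5 * l2_norm(f1) * l2_norm(f2)
print(lhs, rhs)
```

For a group this small the bound is far from sharp; the point is only to make the two sides of the inequality concrete.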

In particular, if G is D-quasirandom and $f_1, f_2 : G \to \mathbb{R}$ are bounded in
magnitude by 1, we have

$$\mathbb{E}_{g \in G} |\mathbb{E}_{x \in G} f_1(x) f_2(xg) - (\mathbb{E}_G f_1)(\mathbb{E}_G f_2)| \leq D^{-1/2}$$

and hence by Markov's inequality $\mathbb{P}(|X| \geq \lambda) \leq \frac{1}{\lambda} \mathbb{E}|X|$ we have

$$|\mathbb{E}_{x \in G} f_1(x) f_2(xg) - (\mathbb{E}_G f_1)(\mathbb{E}_G f_2)| \leq D^{-1/4}$$

for at least a proportion $1 - D^{-1/4}$ of the elements $g \in G$. Thus, when D is large, we
heuristically have the mixing property $\mathbb{E}_{x \in G} f_1(x) f_2(xg) \approx (\mathbb{E}_G f_1)(\mathbb{E}_G f_2)$ for most group
elements $g \in G$. Specialising to the case of indicator functions $f_1 = 1_A$, $f_2 = 1_B$
of sets, we conclude the heuristic that $\frac{|A \cap Bg|}{|G|} \approx \frac{|A|}{|G|} \frac{|B|}{|G|}$ for most $g \in G$, where
$Bg := \{bg : b \in B\}$ is the right-translate of B by g.
Remark 4. Proposition 3 has a counterpart in ergodic theory. Call an infinite
group G weakly mixing if it has no non-trivial finite-dimensional representations.
If G is a countable amenable group, one can show that G is weakly mixing if and
only if, for every ergodic action $(T_g)_{g \in G}$ of G on a probability space $(X, \mu)$, one has

$$\lim_{n \to \infty} \frac{1}{|F_n|} \sum_{g \in F_n} \left| \int_X f_1 \, T_g f_2 \, d\mu - \left( \int_X f_1 \, d\mu \right) \left( \int_X f_2 \, d\mu \right) \right| = 0$$

for all $f_1, f_2 \in L^2(X, \mu)$, where $T_g f(x) := f(xg)$ is the right-translate of f by g,
and $F_1, F_2, \ldots$ is a Følner sequence in G. See [7], [8], [12] for further discussion of
weak mixing for groups (including the case of non-amenable groups).

In [19], it was observed that Proposition 3 could be iterated to obtain similar
weak mixing results for some higher order averages; for instance, three applications
of the above proposition and the triangle inequality show that

$$\mathbb{E}_{g,h \in G} |\mathbb{E}_{x \in G} f_1(x) f_2(xg) f_3(xh) f_4(xgh) - (\mathbb{E}_G f_1)(\mathbb{E}_G f_2)(\mathbb{E}_G f_3)(\mathbb{E}_G f_4)| \leq 4 D^{-1/2}$$

whenever $f_1, f_2, f_3, f_4 : G \to \mathbb{R}$ have magnitude bounded by 1. However, not all
multiple averages could be controlled non-trivially in this fashion; for instance, in
[19, §6], the task of obtaining a mixing bound for the average

$$\mathbb{E}_{g \in G} |\mathbb{E}_{x \in G} f_1(x) f_2(xg) f_3(xg^2) - (\mathbb{E}_G f_1)(\mathbb{E}_G f_2)(\mathbb{E}_G f_3)|$$

was posed as an open problem. Some results for this average will be presented in
the forthcoming paper [34] of the second author, using techniques quite different
from those used here. In this paper, we will focus instead on the weak mixing
properties of the average

$$\mathbb{E}_{x \in G} f_1(x) f_2(xg) f_3(gx)$$

for functions $f_1, f_2, f_3 : G \to \mathbb{R}$, which in particular would control the behavior of
the density $|A \cap Bg \cap gC| / |G|$ for sets $A, B, C \subset G$ and typical $g \in G$.

As already observed in [19, §6], one cannot hope for absolute weak mixing for
this average, due to the simple constraint that xg is conjugate to gx. For instance,
if B is the union of some conjugacy classes in G and C is the union of a disjoint
collection of conjugacy classes, then $A \cap Bg \cap gC$ is empty for every g. However,
our main result asserts, roughly speaking, that these conjugacy classes form the
only obstruction to weak mixing:

Theorem 5 (Relative weak mixing). Let G be a D-quasirandom finite group for
some $D \geq 1$, and let $f_1, f_2, f_3 : G \to \mathbb{R}$ be functions bounded in magnitude by 1.
Then

$$\mathbb{E}_{g \in G} |\mathbb{E}_{x \in G} f_1(x) f_2(xg) f_3(gx) - (\mathbb{E}_G f_1)(\mathbb{E}_G f_2 \mathbb{E}(f_3 | \mathcal{I}_G))| \leq c(D)$$

where $\mathbb{E}(f | \mathcal{I}_G)$ is the orthogonal projection of a function f to conjugation invariant
functions, thus

$$\mathbb{E}(f | \mathcal{I}_G)(x) := \mathbb{E}_{g \in G} f(g x g^{-1}),$$

and $c(D)$ is a quantity depending only on D that goes to zero as $D \to \infty$.
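The conjugacy obstruction noted before Theorem 5 is easy to observe by brute force. The following Python sketch (an illustration under our own choice of example, not taken from the paper) works in the symmetric group $S_4$, with B the class of transpositions and C the class of 3-cycles; the helper names are hypothetical.

```python
from itertools import permutations

def compose(p, q):
    """Group law on permutations stored as tuples: (p*q)(i) = p[q[i]]."""
    return tuple(p[i] for i in q)

def inverse(p):
    inv = [0] * len(p)
    for i, j in enumerate(p):
        inv[j] = i
    return tuple(inv)

G = list(permutations(range(4)))  # the symmetric group S_4

def conjugacy_class(x):
    """The set {h x h^{-1} : h in G}."""
    return {compose(compose(h, x), inverse(h)) for h in G}

def triple_intersection_sizes(A, B, C):
    """|A ∩ Bg ∩ gC| for every g in G, where Bg = {bg} and gC = {gc}."""
    return [len(A & {compose(b, g) for b in B} & {compose(g, c) for c in C})
            for g in G]

A = set(G)
B = conjugacy_class((1, 0, 2, 3))  # the transpositions
C = conjugacy_class((1, 2, 0, 3))  # the 3-cycles

# B and C are disjoint unions of conjugacy classes: the intersection is always empty.
disjoint_sizes = triple_intersection_sizes(A, B, C)
# Taking C = B instead: Bg = gB for a conjugation invariant set, so nothing is lost.
same_sizes = triple_intersection_sizes(A, B, B)
print(set(disjoint_sizes), set(same_sizes))
```

Since an element of $Bg \cap gC$ has the form $bg = gc$, forcing b and c to be conjugate, the first list consists entirely of zeros regardless of the densities of A, B, C.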

Thus, for instance, we have the following general bounds on $|A \cap gB \cap Bg|$ for
most g:
Corollary 6. Let G be a D-quasirandom group for some $D \geq 1$, let $\varepsilon > 0$, and let
$A, B \subset G$. Then we have

$$\frac{|A|}{|G|} \left( \frac{|B|}{|G|} \right)^2 - \varepsilon \leq \frac{|A \cap gB \cap Bg|}{|G|} \leq \frac{|A|}{|G|} \frac{|B|}{|G|} + \varepsilon$$

for all but at most $\varepsilon^{-1} c(D) |G|$ values of $g \in G$, where $c(D)$ goes to zero as $D \to \infty$.

Proof. The upper bound follows from Proposition 3 with $f_1 := 1_A$ and $f_2 := 1_B$
(replacing g by $g^{-1}$), after crudely bounding $|A \cap gB \cap Bg|$ by $|A \cap Bg|$. Now we
turn to the lower bound. By setting $f_1 := 1_A$ and $f_2 = f_3 := 1_B$ (and replacing
g by $g^{-1}$), this bound is immediate from Theorem 5 and Markov's inequality once
one verifies that

$$\mathbb{E}_G 1_B \mathbb{E}(1_B | \mathcal{I}_G) \geq \left( \frac{|B|}{|G|} \right)^2. \qquad (1)$$

But this follows from the identities

$$\mathbb{E}_G 1_B \mathbb{E}(1_B | \mathcal{I}_G) = \mathbb{E}_G \mathbb{E}(1_B | \mathcal{I}_G) \mathbb{E}(1_B | \mathcal{I}_G)$$

$$\mathbb{E}_G \mathbb{E}(1_B | \mathcal{I}_G) = \frac{|B|}{|G|}$$

and the Cauchy-Schwarz inequality. □
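The two identities and the inequality (1) used in this proof can be verified exactly on a small example. The sketch below is only an illustration: the group $S_4$, the random set B, and the names `conditional_expectation` and `mean` are our own hypothetical choices, and exact rational arithmetic is used so that the identities hold with equality rather than up to rounding.

```python
import random
from fractions import Fraction
from itertools import permutations

def compose(p, q):
    """Group law on permutations stored as tuples: (p*q)(i) = p[q[i]]."""
    return tuple(p[i] for i in q)

def inverse(p):
    inv = [0] * len(p)
    for i, j in enumerate(p):
        inv[j] = i
    return tuple(inv)

G = list(permutations(range(4)))  # the symmetric group S_4
n = len(G)

def conditional_expectation(f):
    """E(f | I_G)(x) = E_g f(g x g^{-1}), for f given as a dict on G."""
    return {x: Fraction(sum(f[compose(compose(g, x), inverse(g))] for g in G), n)
            for x in G}

def mean(f):
    """E_G f for f given as a dict on G."""
    return sum(f.values(), Fraction(0)) / n

rng = random.Random(1)          # fixed seed; B is an arbitrary 9-element set
B = set(rng.sample(G, 9))
ind_B = {x: int(x in B) for x in G}
proj = conditional_expectation(ind_B)

density = Fraction(len(B), n)
lhs = mean({x: ind_B[x] * proj[x] for x in G})   # E_G 1_B E(1_B | I_G)
proj_sq = mean({x: proj[x] ** 2 for x in G})     # E_G E(1_B | I_G)^2
print(lhs, proj_sq, mean(proj), density ** 2)
```

Here `lhs == proj_sq` reflects that the conditional expectation is an orthogonal projection, and `lhs >= density ** 2` is inequality (1).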

Note from Theorem 5 that apart from improvements in the $\varepsilon^{-1} c(D)$ factor,
both bounds in the above corollary are best possible without further hypotheses
on B, with the lower bound essentially attained when B is approximately evenly
distributed among all (or almost all) conjugacy classes, and the upper bound
essentially attained when B is the union of conjugacy classes.

One can also view Theorem 5 more probabilistically:

Corollary 7 (Relative weak mixing, again). Let G be a D-quasirandom finite group
for some $D \geq 1$. Let x, g be drawn uniformly at random from G. Let $x_0, x_1, x_2, x_3$
be further random variables drawn from G, with $x_0, x_1, x_2$ drawn uniformly and
independently at random from G, and for each choice of $x_0, x_1, x_2$, the random
variable $x_3$ is drawn uniformly from the conjugacy class of $x_2$. (Equivalently, one
could take $x_3 := h x_2 h^{-1}$, where h is drawn uniformly from G independently of
$x_0, x_1, x_2$.) Then the random variables $(g, x, xg, gx)$ and $(x_0, x_1, x_2, x_3)$ in $G^4$ are
close in the following weak sense: whenever $f_0, f_1, f_2, f_3 : G \to \mathbb{R}$ are functions
bounded in magnitude by 1, then

$$|\mathbb{E} f_0(g) f_1(x) f_2(xg) f_3(gx) - \mathbb{E} f_0(x_0) f_1(x_1) f_2(x_2) f_3(x_3)| \leq c(D)$$

where $c(D) \to 0$ as $D \to \infty$.

The equivalence of Corollary 7 and Theorem 5 follows from a routine computation
which we leave to the reader. Informally, this corollary asserts that in a
quasirandom group, the only significant constraint on the tuple (x, g, xg, gx) for random
x, g is the obvious constraint that xg and gx are conjugate to each other, at least
for the purposes of computing "order 1 statistics" such as $\mathbb{E} f_0(g) f_1(x) f_2(xg) f_3(gx)$
involving products of quantities, each of which only involves at most one of the
expressions x, g, xg, gx.
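The parenthetical equivalence in Corollary 7, that $h x_2 h^{-1}$ with h uniform is uniformly distributed on the conjugacy class of $x_2$, can be confirmed exhaustively in a small group. The sketch below is an illustration only; the group $S_4$ and the name `conjugation_counts` are our own hypothetical choices.

```python
from collections import Counter
from itertools import permutations

def compose(p, q):
    """Group law on permutations stored as tuples: (p*q)(i) = p[q[i]]."""
    return tuple(p[i] for i in q)

def inverse(p):
    inv = [0] * len(p)
    for i, j in enumerate(p):
        inv[j] = i
    return tuple(inv)

G = list(permutations(range(4)))  # the symmetric group S_4

def conjugation_counts(x2):
    """How often each value h x2 h^{-1} occurs as h ranges over all of G."""
    return Counter(compose(compose(h, x2), inverse(h)) for h in G)

# Uniformity on the class: for every x2, all attained values occur equally often
# (each is hit once per element of the centraliser of x2).
uniform_on_class = all(len(set(conjugation_counts(x).values())) == 1 for x in G)

transposition_counts = conjugation_counts((1, 0, 2, 3))
print(uniform_on_class, len(transposition_counts), set(transposition_counts.values()))
```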

We can also easily obtain the following combinatorial consequence of the above
results, which can be viewed as a density version of the non-commutative Schur
theorem from [5], in the case of quasirandom groups:

Corollary 8 (Density noncommutative Schur theorem). Let G be a D-quasirandom
finite group for some $D \geq 1$. Let $A, B, C \subset G$ with $|A|, |B|, |C| \geq \delta |G|$ for some
$\delta > 0$. Then, if D is sufficiently large depending on $\delta$, there exist $g \in A$, $x \in B$
with $xg, gx \in C$ and x, g, xg, gx all distinct.

Proof. Let x, g be drawn uniformly at random from G, and let $x_0, x_1, x_2, x_3$ be
drawn as in Corollary 7. We have

$$\mathbb{E} 1_A(x_0) 1_B(x_1) 1_C(x_2) 1_C(x_3) = (|A|/|G|)(|B|/|G|) \mathbb{E}_G(1_C \mathbb{E}(1_C | \mathcal{I}_G))$$

and hence by (1)

$$\mathbb{E} 1_A(x_0) 1_B(x_1) 1_C(x_2) 1_C(x_3) \geq \delta^4.$$

By Corollary 7 we thus have (for D large enough) that

$$\mathbb{E} 1_A(g) 1_B(x) 1_C(xg) 1_C(gx) \geq \delta^4 / 2,$$

thus there are at least $\delta^4 |G|^2 / 2$ tuples (g, x, xg, gx) with $g \in A$, $x \in B$, and
$xg, gx \in C$.

Now we eliminate those tuples in which g, x, xg, gx are not all distinct. Clearly
there are at most $O(|G|)$ tuples for which g = x or for which one of g, x is equal
to one of xg, gx. Now we consider those tuples for which xg = gx. By Burnside's
lemma, the number of such tuples is equal to $|G|$ times the number of conjugacy
classes of G. But from the Peter-Weyl theorem, the number of conjugacy classes
is also equal to the number of non-isomorphic irreducible unitary representations
$\rho : G \to U_{d_\rho}(\mathbb{C})$ of G, which obey the identity

$$\sum_\rho d_\rho^2 = |G|.$$

If G is D-quasirandom, then $d_\rho \geq D$ for all non-trivial $\rho$, and so we see that the
number of pairs x, g with xg = gx is at most

$$|G| \left( 1 + \frac{|G| - 1}{D^2} \right) \leq \frac{|G|^2}{D^2} + |G|.$$

Putting everything together, we see that there are at most $O(|G|^2 / D^2) + O(|G|)$
tuples for which x, g, xg, gx are not all distinct. Since $|G| > D$ (as can be seen
by considering the regular representation of G), the claim then follows if D is
sufficiently large depending on $\delta$. □
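The count of commuting pairs in this proof can be checked by brute force on a small quasirandom group. The sketch below is an illustration only: it uses $A_5$, assumes the standard fact that $A_5$ is 3-quasirandom, and the variable names are hypothetical.

```python
from itertools import permutations

def compose(p, q):
    """Group law on permutations stored as tuples: (p*q)(i) = p[q[i]]."""
    return tuple(p[i] for i in q)

def inverse(p):
    inv = [0] * len(p)
    for i, j in enumerate(p):
        inv[j] = i
    return tuple(inv)

def is_even(p):
    """Parity of a permutation via its inversion count."""
    n = len(p)
    return sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n)) % 2 == 0

G = [p for p in permutations(range(5)) if is_even(p)]  # the group A_5

# Number of pairs (x, g) with xg = gx, counted directly.
commuting_pairs = sum(1 for x in G for g in G if compose(x, g) == compose(g, x))

# Conjugacy classes, each stored as a frozenset so that duplicates collapse.
classes = {frozenset(compose(compose(h, x), inverse(h)) for h in G) for x in G}

D = 3  # assumed quasirandomness parameter of A_5
bound = len(G) * (1 + (len(G) - 1) / D ** 2)

print(commuting_pairs, len(G) * len(classes), bound)
```

The first two printed numbers agree, which is the Burnside count, and both lie below the bound $|G|(1 + (|G|-1)/D^2)$ derived from quasirandomness.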

In order to prove Theorem 5, we will introduce a new version of the Furstenberg
correspondence principle adapted to sequences of finite quasirandom groups, which
is based on the ultraproduct construction used in nonstandard analysis. As with
the usual correspondence principle, this construction will allow us to deduce the
combinatorial mixing result in Theorem 5 from a corresponding mixing result in
ergodic theory. A key feature of the construction is that significant vestiges of the
mixing property from Proposition 3 are retained in the measure-preserving system
that one studies on the ergodic theoretic side of the correspondence principle. On
the other hand, the group that acts in this setting is not expected to be amenable,
instead behaving more like the free nonabelian group. Fortunately, there is an
existing tool in the literature for exploiting mixing properties for non-amenable
groups, namely the machinery of idempotent ultrafilters. We will introduce the
necessary definitions in later sections. For more information on ultrafilters and
their use in ergodic theory see [3j EJ [10] . 

Due to the repeated use of infinitary techniques (including two completely
separate and unrelated uses of ultrafilters), our arguments do not give an effective^2
bound on the rate of decay of $c(D)$ as $D \to \infty$. In particular, we do not know if
one can obtain a polynomial rate of decay in D, in analogy with Proposition 3.

In order to illustrate the general ultraproduct correspondence principle strategy, 
we also give a significantly simpler and weaker recurrence result which does not 
assume quasirandomness, but only establishes recurrence for the pattern (x, xg, gx)
rather than (g, x, xg, gx):

Theorem 9 (Double recurrence). For every $\delta > 0$, there exist $N > 0$ and $\varepsilon > 0$
such that the following statement holds: if G is a finite group of cardinality at least
N, and A is a subset of G with $|A| \geq \delta |G|$, then there exists a non-trivial group
element $g \in G$ such that $|A \cap g^{-1}A \cap Ag^{-1}| \geq \varepsilon |G|$. In particular, there exists a
non-trivial $g \in G$ and $x \in G$ such that $x, xg, gx \in A$.

We prove this theorem in Section G3 It may be compared with [TTJ Corollary 
6.5], which with the same hypotheses produces a non-trivial $g \in G$ and $x \in G$ such
that $x, gx, xg^{-1} \in A$.

Actually, we can improve Theorem 9 slightly:

Theorem 10 (Double strong recurrence). For every $\delta > 0$, there exists $\varepsilon > 0$
such that the following statement holds: if G is a finite group, and A is a subset
of G with $|A| \geq \delta |G|$, then there exist at least $\varepsilon |G|^2$ pairs $(g, x) \in G \times G$ such that
$x, xg, gx \in A$.

We give two proofs of this theorem in Section 5. One proof is measure-theoretic
in nature (and similar in spirit to the ergodic theory methods). The other proof
is combinatorial, relying on the triangle removal lemma of Ruzsa and Szemerédi
[29]. However, we do not know how to use such methods to establish Theorem 5 or
Corollary 8. As with the proof of Corollary 8, we can ensure that x, xg, gx are all
distinct if one assumes a sufficient amount^3 of non-commutativity in the group G.
Of course, if G is abelian, then xg = gx and Theorems 9, 10 become trivial.

As we shall show in Section 6, the combinatorial proof of Theorem 10 generalizes
to give a multidimensional version:

Theorem 11 (Multiple strong recurrence). Let $k \geq 1$ be an integer. For every
$\delta > 0$, there exists $\varepsilon > 0$ such that the following statement holds: if G is a finite
group, and A is a subset of $G^k$ with $|A| \geq \delta |G|^k$, then there exist at least $\varepsilon |G|^{k+1}$
tuples $(g, x_1, \ldots, x_k) \in G^{k+1}$ such that

$$(gx_1, \ldots, gx_i, x_{i+1}, \ldots, x_k) \in A$$

^2 The most infinitary step in the arguments involves the use of idempotent ultrafilters, which
are closely related to Hindman's theorem [22] in infinitary Ramsey theory. It may be possible to
use some suitable finitizations of Hindman's theorem as a substitute for the tool of idempotent
ultrafilters to eventually obtain some (incredibly poor) quantitative decay rate for $c(D)$, but we
do not pursue this issue here.

^3 The groups G with few commuting pairs $\{(g, x) \in G \times G : xg = gx\}$ were qualitatively classified
in [26]. Roughly speaking, the necessary and sufficient condition that G has $o(|G|^2)$ such pairs
is that G does not have a bounded index subgroup whose commutator also has bounded index.
Note this is a much weaker property than quasirandomness, for which the argument used to prove
Corollary 8 may be applied. See also [21] for some more quantitatively precise characterizations
of groups with many commuting pairs.
for all^4 $i = 0, \ldots, k$, and also

$$(g x_1 g^{-1}, \ldots, g x_k g^{-1}) \in A.$$

Note that the k = 1 case of this theorem, when applied to the set A in Theorem
10, gives at least $\varepsilon |G|^2$ pairs (g, x) such that $x, gx, gxg^{-1} \in A$; by replacing (x, g)
by $(g'x', (g')^{-1})$ we see that Theorem 10 is equivalent to the k = 1 case of Theorem
11.
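The change of variables $(x, g) = (g'x', (g')^{-1})$ is a bijection of $G \times G$, so the two patterns occur equally often in any set A. The following Python sketch (an illustration only; the group $S_4$, the random set A and the function names are our own hypothetical choices) confirms this by direct counting.

```python
import random
from itertools import permutations

def compose(p, q):
    """Group law on permutations stored as tuples: (p*q)(i) = p[q[i]]."""
    return tuple(p[i] for i in q)

def inverse(p):
    inv = [0] * len(p)
    for i, j in enumerate(p):
        inv[j] = i
    return tuple(inv)

G = list(permutations(range(4)))  # the symmetric group S_4

def count_pattern_theorem10(A):
    """Number of pairs (g, x) with x, xg, gx all in A."""
    return sum(1 for g in G for x in G
               if x in A and compose(x, g) in A and compose(g, x) in A)

def count_pattern_k1(A):
    """Number of pairs (g, x) with x, gx, g x g^{-1} all in A."""
    return sum(1 for g in G for x in G
               if x in A and compose(g, x) in A
               and compose(compose(g, x), inverse(g)) in A)

rng = random.Random(2)       # fixed seed; A is an arbitrary 10-element subset
A = set(rng.sample(G, 10))
count_10 = count_pattern_theorem10(A)
count_k1 = count_pattern_k1(A)
print(count_10, count_k1)
```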

The analogue of Theorem [5] (and hence Theorem [TP]) can fail for the pattern 
x, g, gx, xg if one does not assume quasirandomness. For instance, if G has an
index two subgroup H, and A is the complement of H in G, then A contains no
patterns of the form x, g, gx, let alone x, g, gx, xg. However, it is still reasonable
to hope for a "noncommutative Schur theorem", namely that if a finite group G
is partitioned into r color classes $A_1, \ldots, A_r$, then at least one of the color classes
$A_i$ has the property that $x, g, gx, xg \in A_i$ for at least $c|G|^2$ pairs $(x, g) \in G^2$,
where c depends only on r; in particular, under a suitable hypothesis that G is
sufficiently noncommutative (in the sense of footnote 3), we can find x, g such that
x, g, gx, xg are distinct elements of $A_i$. We could not verify or disprove this claim,
but note that an analogous claim in the infinitary setting of countable amenable
groups was established in [TQ1 Theorem 3.4]. If one replaces the pattern x, g, gx, xg
by x, g, gx then the claim easily follows from Folkman's theorem [ljj §3-4] applied
to a randomly chosen (finite portion of an) IP system in G; we omit the details.
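The index two counterexample mentioned above can be made concrete. In the sketch below (an illustration only, with our own choice of group) G is $S_4$, H is the alternating group $A_4$, and A is the set of odd permutations, which has density 1/2 but contains no pattern x, g, gx because the product of two odd permutations is even.

```python
from itertools import permutations

def compose(p, q):
    """Group law on permutations stored as tuples: (p*q)(i) = p[q[i]]."""
    return tuple(p[i] for i in q)

def is_even(p):
    """Parity of a permutation via its inversion count."""
    n = len(p)
    return sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n)) % 2 == 0

G = list(permutations(range(4)))          # the symmetric group S_4
H = {p for p in G if is_even(p)}          # index two subgroup A_4
A = set(G) - H                            # complement of H: the odd permutations

# All pairs (x, g) with x, g and gx simultaneously in A.
patterns = [(x, g) for x in A for g in A if compose(g, x) in A]
print(len(A), len(G), len(patterns))
```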

Remark 12. Throughout this paper we shall freely use the axiom of choice.
However, thanks to a well known result of Gödel [18], any result that can be formalized
in first-order arithmetic (such as^5 Theorem 5) and is provable in Zermelo-Fraenkel
set theory with the axiom of choice (ZFC), can also be proven in Zermelo-Fraenkel
set theory without the axiom of choice (ZF).

1.1. Acknowledgments. The first author acknowledges the support of the NSF 
under grant DMS-1162073. The second author was partially supported by a Simons 
Investigator award from the Simons Foundation and by NSF grant DMS-0649473. 
The authors also thank Robert Guralnick for help with the references. 

2. Ultraproducts, σ-topologies, and Loeb measure

The arguments in this paper will rely heavily on the machinery of ultraproducts, 
as well as some related concepts such as σ-topological spaces and the Loeb measure
construction. The purpose of this section is to review this machinery. 

Given a set S, define an ultrafilter on S to be a collection $\alpha$ of subsets of S such
that the map $A \mapsto 1_{A \in \alpha}$, that assigns 1 to subsets A of S in $\alpha$, and 0 to all other
subsets, is a finitely additive {0, 1}-valued probability measure on S. The set of all
ultrafilters on S is denoted $\beta S$. One can embed S in $\beta S$ by identifying each $x \in S$ with
the ultrafilter $\{A \subset S : x \in A\}$ (or, equivalently, with the Dirac measure at x). A
routine application of Zorn's lemma shows that there exist non-principal ultrafilters
$\alpha \in \beta S \setminus S$ for any infinite set S.
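The measure-theoretic definition of an ultrafilter can be tested exhaustively on a finite set, where (in contrast to the infinite case) every ultrafilter is principal. The Python sketch below is an illustration only: it enumerates every collection of subsets of a three-element set and keeps those whose indicator is a finitely additive {0, 1}-valued probability measure. The names `is_ultrafilter`, `ultrafilters` and `principal` are hypothetical.

```python
from itertools import combinations, product

S = (0, 1, 2)
subsets = [frozenset(c) for r in range(len(S) + 1) for c in combinations(S, r)]

def is_ultrafilter(alpha):
    """Check that A -> 1_{A in alpha} is a finitely additive {0,1}-valued
    probability measure on S."""
    def m(A):
        return int(A in alpha)
    if m(frozenset(S)) != 1:
        return False
    # Finite additivity on every pair of disjoint subsets.
    return all(m(A | B) == m(A) + m(B)
               for A in subsets for B in subsets if not (A & B))

# Every collection of subsets of S, encoded by a 0/1 choice for each subset.
all_collections = (
    frozenset(s for s, keep in zip(subsets, bits) if keep)
    for bits in product([0, 1], repeat=len(subsets))
)
ultrafilters = {alpha for alpha in all_collections if is_ultrafilter(alpha)}

# The principal ultrafilters {A : x in A}, one for each point x of S.
principal = {frozenset(A for A in subsets if x in A) for x in S}
print(len(ultrafilters), ultrafilters == principal)
```

Non-principal ultrafilters, which are what the paper actually uses, exist only on infinite sets and cannot be exhibited by a finite search of this kind.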

^4 We ignore the block $gx_1, \ldots, gx_i$ when i = 0 and ignore the block $x_{i+1}, \ldots, x_k$ when i = k;
thus we interpret $(gx_1, \ldots, gx_i, x_{i+1}, \ldots, x_k)$ as $(x_1, \ldots, x_k)$ in the case i = 0, and $(gx_1, \ldots, gx_k)$
in the case i = k.

^5 Strictly speaking, the definition of quasirandomness involves the field $\mathbb{C}$ of complex numbers,
but it is easy to see that one can replace that field by the algebraic closure $\overline{\mathbb{Q}}$ of the rationals,
which is easily formalized within first-order arithmetic.
Now we fix a non-principal ultrafilter $\alpha \in \beta\mathbb{N} \setminus \mathbb{N}$ on the natural numbers $\mathbb{N}$. A
subset of $\mathbb{N}$ is said to be $\alpha$-large if it lies in $\alpha$. Given a sequence $A_n$ of sets indexed
by all n in an $\alpha$-large set, we define the ultraproduct $\prod_{n \to \alpha} A_n$ to be the space
of all formal^6 limits (or ultralimits) $\lim_{n \to \alpha} x_n$ of sequences $x_n$ defined and in $A_n$
for an $\alpha$-large set of n, with two formal limits $\lim_{n \to \alpha} x_n$, $\lim_{n \to \alpha} y_n$ declared to be
equal if we have $x_n = y_n$ for an $\alpha$-large set of n. An ultraproduct $\prod_{n \to \alpha} A_n$ of
sets $A_n$ will be referred to as an internal set. For a single set X, the ultraproduct
$\prod_{n \to \alpha} X$ is called the ultrapower of X and is denoted ${}^*X$; note that X embeds into
${}^*X$ after identifying each $x \in X$ with its ultralimit $\lim_{n \to \alpha} x$. Given a sequence
$f_n : X_n \to Y_n$ of functions defined for an $\alpha$-large set of n, we define the ultralimit
$\lim_{n \to \alpha} f_n$ to be the function from $\prod_{n \to \alpha} X_n$ to $\prod_{n \to \alpha} Y_n$ defined by

$$(\lim_{n \to \alpha} f_n)(\lim_{n \to \alpha} x_n) := \lim_{n \to \alpha} f_n(x_n).$$

Such functions are also known as internal functions.

Given an element $x = \lim_{n \to \alpha} x_n$ of the ultrapower ${}^*\mathbb{R}$ of the reals, we say that x
is bounded if $|x| \leq C$ for some real number C (i.e. $|x_n| \leq C$ for an $\alpha$-large set of n),
and infinitesimal if $|x| \leq \varepsilon$ for every real $\varepsilon > 0$. A well-known Bolzano-Weierstrass
argument shows that every bounded $x \in {}^*\mathbb{R}$ can be expressed uniquely as the sum
of a real number st(x), referred to as the standard part of x, and an infinitesimal.
By convention, we define the standard part of an unbounded element of ${}^*\mathbb{R}$ to be
$\infty$. The quantity $\mathrm{st} \lim_{n \to \alpha} x_n$ is also known as the $\alpha$-limit of the $x_n$, and can be
denoted as $\alpha\text{-}\lim_n x_n$.

Internal sets $X = \prod_{n \to \alpha} X_n$ do not have a natural topological structure (other
than the discrete topology). However, as pointed out in [30], there is a useful
substitute for this topological structure on such an internal set X, namely the more
general concept of a σ-topological structure.

Definition 13 (σ-topology). [30] A σ-topology on a set X is a collection $\mathcal{F}$ of
subsets of X which contains the empty set and whole set X, is closed under
finite intersections, and is closed under at most countable unions (as opposed to
arbitrary unions, which is the definition of a true topology). The pair $(X, \mathcal{F})$ will
be called a σ-topological space. An element of $\mathcal{F}$ will be called countably open, and
the complement of a countably open set in X will be called countably closed. A
map $f : X \to Y$ between two σ-topological spaces $(X, \mathcal{F}_X)$, $(Y, \mathcal{F}_Y)$ will be called
continuous if the inverse image of any countably open subset of Y is a countably
open subset of X, or equivalently if the inverse image of a countably closed set is
a countably closed set. Similarly, the map f is said to be open (resp. closed) if
the forward image of any countably open (resp. closed) subset of X is a countably
open (resp. closed) subset of Y.

A σ-topological space $(X, \mathcal{F})$ is said to be countably compact if any countable
cover $X \subset \bigcup_{m=1}^\infty V_m$ of X by countably open sets has a finite subcover, or
equivalently if any collection $(F_m)_{m \in \mathbb{N}}$ of countably closed subsets in X with the property
that any finite intersection of the $F_m$ is non-empty, also necessarily has non-empty
joint intersection $\bigcap_{m=1}^\infty F_m$. A σ-topological space $(X, \mathcal{F})$ is said to be $T_1$ if every
point in X is countably closed.



^6 Note that despite the formal use of the lim notation, no topological structure is required
on the $X_n$ in order to define an ultraproduct. If one prefers, one could view $\prod_{n \to \alpha} X_n$ as the
space of equivalence classes of tuples $(x_n)_{n \in A}$ defined on $\alpha$-large sets A, with $(x_n)_{n \in A}$, $(y_n)_{n \in B}$
equivalent if one has $x_n = y_n$ for an $\alpha$-large set of n.
One should view "countably compact $T_1$" as being to σ-topology as "compact
Hausdorff" is to ordinary topology.

We have the following basic compactness theorem, known to model theorists as 
the countable saturation property: 

Lemma 14 (Countable saturation). Let $X = \prod_{n \to \alpha} X_n$ be an internal set, and
let $\mathcal{F}_X$ be the collection of all subsets of X that can be expressed as the union of at
most countably many internal subsets of X. Then $(X, \mathcal{F}_X)$ is a countably compact
$T_1$ σ-topological space.

Furthermore, any internal function $f : X \to Y$ between two internal sets X, Y
will be continuous, open and closed with respect to these σ-topological structures.

Proof. It is easy to see that $\mathcal{F}_X$ is a $T_1$ σ-topology. To verify countable compactness,
we will use the formulation from Definition 13 involving countably closed sets.
Expressing each $F_m$ as the countable intersection of internal sets, we see that we
may assume without loss of generality that the $F_m$ are internal, thus for each m we
have $F_m = \prod_{n \to \alpha} F_{n,m}$ for some subsets $F_{n,m}$ of $X_n$. (Strictly speaking, $F_{n,m}$ is
initially only defined for an $\alpha$-large set of n, but we can extend to all n by defining
$F_{n,m}$ to be the empty set for all other values of n.)

By hypothesis, $\bigcap_{m=1}^M F_m$ is non-empty for any fixed M. As a consequence, for
each such M, we see that $\bigcap_{m=1}^M F_{n,m}$ is non-empty for all n in an $\alpha$-large subset $S_M$
of $\mathbb{N}$. By shrinking the $S_M$ if necessary, we may assume that they are decreasing
in M. For each $n \in S_1$, let $M_n$ be the largest natural number less than or equal
to n with the property that $n \in S_{M_n}$, so that $\bigcap_{m=1}^{M_n} F_{n,m}$ is non-empty. By the
axiom of choice, we may thus find a sequence $(x_n)_{n \in S_1}$ such that $x_n \in \bigcap_{m=1}^{M_n} F_{n,m}$
for all $n \in S_1$. If we form $x := \lim_{n \to \alpha} x_n$, then we have $x \in F_M$ for all M, since
$x_n \in F_{n,M}$ for all $n \in S_M$ with $n \geq M$ (an $\alpha$-large set of n, as $\alpha$ is non-principal).
The claim follows.

Now let $f : X \to Y$ be an internal function. It is clear that f is continuous
and open. To demonstrate that it is closed, let F be a countably closed subset of
X, thus $F = \bigcap_{n=1}^\infty F_n$ for some internal subsets $F_n$ of X. Observe that if $y \in Y$
lies in $f(F_n)$ for each n, then the internal sets $\{x \in F_n : f(x) = y\}$ have all finite
intersections non-empty, and hence by countable compactness $\{x \in F : f(x) = y\}$
is non-empty as well. This shows that $f(F) = \bigcap_{n=1}^\infty f(F_n)$, and so f is closed as
required. □

Henceforth we endow all internal sets with the σ-topological structure given by
Lemma 14. This structure is unfortunately not a genuine topology, as all points
are internal and thus countably open, but arbitrary unions of points need not be
countably open. However, it turns out in practice that σ-topological structure can
serve as a reasonable substitute for genuine topological structure, so long as one
restricts attention to at most countably many sets at any given time (and provided
that one works exclusively with sequences rather than with nets).

We also note the pleasant fact that the standard part function $\mathrm{st} : {}^*\mathbb{R} \to \mathbb{R} \cup \{\infty\}$
is a continuous map from ${}^*\mathbb{R}$ (with the σ-topological structure) to $\mathbb{R} \cup \{\infty\}$ (with
the usual topological structure), thus the inverse image of an open (resp. closed)
set in $\mathbb{R} \cup \{\infty\}$ is countably open (resp. countably closed) in ${}^*\mathbb{R}$.

Given a finite non-empty set X, we can define the uniform probability measure
$\mu_X$ on X by the formula

$$\mu_X(E) := |E| / |X|.$$
It turns out that this simple measure construction can be extended to ultraproducts
of finite non-empty sets as well, and is known as the Loeb measure construction:

Definition 15 (Loeb measure). [25] Let $X = \prod_{n \to \alpha} X_n$ be an ultraproduct of
finite non-empty sets $X_n$, and let $\mu_{X_n}$ be the uniform probability measures on each
$X_n$. Let $\mathcal{B}^0_X$ be the Boolean algebra of internal subsets of X, and let $\mathcal{B}_X$ be the
σ-algebra generated by $\mathcal{B}^0_X$. We define the Loeb measure $\mu_X$ on X to be the unique
probability measure on $\mathcal{B}_X$ with the property that

$$\mu_X(F) := \mathrm{st}({}^*\mu_X(F))$$

whenever $F = \prod_{n \to \alpha} F_n$ is an internal subset of X, where

$${}^*\mu_X(F) := \lim_{n \to \alpha} \mu_{X_n}(F_n) \in {}^*[0, 1]$$

is the internal measure of F.

To verify that Loeb measure actually exists and is unique, we observe from
Lemma 14 that the function $\mu_X$ defined on $\mathcal{B}^0_X$ is a premeasure with total mass
one, and the claim then follows from the Carathéodory extension theorem (or, more
precisely, the Hahn-Kolmogorov extension theorem).

Remark 16. One can view $\mathcal{B}^0_X$ as analogous to the algebra of elementary subsets
of a Euclidean space (i.e. Boolean combinations of finitely many boxes), with
$\mathcal{B}_X$ as analogous to the σ-algebra of Borel sets (indeed, note that this σ-algebra
is generated by the countably open sets). One could, if one wished, form the
completion of Loeb measure by adjoining all sub-null sets to $\mathcal{B}_X$, thus giving a
measure analogous to Lebesgue measure rather than Borel measure. It will however
be convenient to avoid working with this completion, as it has poorer properties
with respect to restriction to measure zero subsets. (This is analogous to how a
slice of a Borel measurable subset of $\mathbb{R}^2$ is automatically Borel measurable in $\mathbb{R}$,
whereas the analogous claim for Lebesgue measurable subsets is certainly false.)

It will also be very important to keep in mind that on the product $X \times Y$ of two
ultraproducts of finite non-empty sets, the Loeb measure $\mu_{X \times Y}$ is not, in general,
the product $\mu_X \times \mu_Y$ of the Loeb measures on X and Y; instead, the latter measure
is a restriction of the former to a much smaller σ-algebra ($\mathcal{B}_X \times \mathcal{B}_Y$ is usually much
smaller than $\mathcal{B}_{X \times Y}$). In a similar spirit, the σ-topology on $X \times Y$ is not the product
of the σ-topologies on X and Y in general, but is usually a much finer σ-topology.
We will discuss this important phenomenon in more detail later in this section.

We record the following pleasant approximation property:

Lemma 17 (Approximation by internal sets). Let X be an ultraproduct of finite
non-empty sets, and let E be a Loeb measurable subset of X. Then there exists an
internal subset E' of X that differs from E by a $\mu_X$-null set (thus $\mu_X(E \Delta E') = 0$).

Proof. As the Loeb σ-algebra is generated by the Boolean algebra of internal
subsets, it suffices to show that the property of differing from an internal subset by
a $\mu_X$-null set is closed under complements and countable unions. The
complementation property is clear. To prove the countable union property, it suffices by
countable additivity of $\mu_X$ to show that any countably open set $\bigcup_{n=1}^\infty E_n$ (where each
$E_n$ is an internal subset of X) itself differs from an internal subset by a $\mu_X$-null
set. We may of course assume that the $E_n$ are disjoint. Let p denote the Loeb
measure of $\bigcup_n E_n$, thus $p = \sum_{n=1}^\infty \mu_X(E_n)$. For any $n_0 \in \mathbb{N}$, we clearly can find an
internal subset S of X which contains $\bigcup_{n=1}^{n_0} E_n$ and has internal measure at most
$p + \frac{1}{n_0}$; indeed, one can just take $S = \bigcup_{n=1}^{n_0} E_n$ itself. This is an internal
property of the set S, and so by countable saturation (applied to the internal power set
$\prod_{n \to \alpha} 2^{X_n}$ of X) we conclude that there exists an internal subset S of X which
contains $\bigcup_{n=1}^{n_0} E_n$ and has internal measure (and hence Loeb measure) at most $p + \frac{1}{n_0}$
for every $n_0 \in \mathbb{N}$. In particular, it can only differ from $\bigcup_{n=1}^\infty E_n$ by a $\mu_X$-null set,
and the claim follows. □

One of the basic theorems in ordinary topology is Tychonoff's theorem that the
arbitrary product of compact topological spaces is still compact. Related to this is
the assertion (proven using the Kolmogorov extension theorem) that the product of
arbitrarily many (inner regular) probability spaces is still a probability space. We
now develop analogues of these two assertions for σ-topological spaces and for Loeb
measure, which will be needed later in the paper when we wish to apply probability
theory to a sequence of random variables drawn independently and uniformly at
random from an ultraproduct of finite non-empty sets.

We first give the σ-topological version of Tychonoff's theorem, a fact closely
related to the property of $\omega_1$-saturation considered by model theorists.

Lemma 18 ($\sigma$-Tychonoff theorem). Let $(X_a)_{a \in A}$ be a family of sets $X_a$ indexed by
an at most countable set $A$, let $X_A := \prod_{a \in A} X_a$ be the product space, and for each
$I \subset J \subset A$, let $\pi_{I \leftarrow J} : X_J \to X_I$ be the obvious projection map between the spaces
$X_J := \prod_{a \in J} X_a$ and $X_I := \prod_{a \in I} X_a$. Suppose that for each finite subset $I$ of $A$,
$X_I$ is equipped with a countably compact $T_1$ $\sigma$-topology $\mathcal{F}_I$, such that the projection
maps $\pi_{I \leftarrow J}$ are all both continuous and closed. Define a cylinder set on $X_A$ to be
a set of the form $\pi_{I \leftarrow A}^{-1}(V_I)$, where $V_I$ is a countably open subset of $X_I$, and let $\mathcal{F}_A$
be the collection of all at most countable unions of cylinder sets. Then $(X_A, \mathcal{F}_A)$ is
also a countably compact $T_1$ $\sigma$-topological space.

Proof. It is clear that $(X_A, \mathcal{F}_A)$ is a $\sigma$-topological space, so we only need to verify
countable compactness. As in Lemma 14 it suffices to show that if $(E_n)_{n=1}^\infty$ is
a sequence of cylinder sets $E_n = \pi_{I_n \leftarrow A}^{-1}(F_{I_n})$, where each $I_n$ is finite and $F_{I_n}$ is
countably closed, with $\bigcap_{n=1}^M E_n$ non-empty for every finite $M$, then $\bigcap_{n=1}^\infty E_n$ is
non-empty as well.

By increasing each $I_n$ if necessary (and using the continuity of the projection
maps $\pi_{I \leftarrow J}$) we may assume that the $I_n$ are increasing in $n$, and then by shrinking
the $F_{I_n}$ we may assume that $F_{I_m} \subset \pi_{I_n \leftarrow I_m}^{-1}(F_{I_n})$ for all $m > n$.

We now recursively construct a sequence of points $x_n \in \bigcap_{m \ge n} \pi_{I_n \leftarrow I_m}(F_{I_m})$ for
$n = 1, 2, \ldots$ as follows. To construct $x_1$, observe from the closed nature of the $\pi_{I \leftarrow J}$
that the sets $\pi_{I_1 \leftarrow I_m}(F_{I_m})$ are countably closed non-empty decreasing subsets of $X_{I_1}$,
hence by countable compactness $\bigcap_{m=1}^\infty \pi_{I_1 \leftarrow I_m}(F_{I_m})$ is non-empty. We select a point
$x_1$ arbitrarily from this set. Now assume inductively that $n > 1$ and that $x_{n-1}$
has already been constructed. Then the sets $\pi_{I_n \leftarrow I_m}(F_{I_m} \cap \pi_{I_{n-1} \leftarrow I_m}^{-1}(\{x_{n-1}\}))$
for $m \ge n$ are countably closed non-empty decreasing subsets of $X_{I_n}$ (here we
use the continuous and closed nature of the $\pi_{I \leftarrow J}$, as well as the $T_1$ nature of
$\mathcal{F}_{I_{n-1}}$), and hence by countable compactness we can find a point $x_n$ in the joint
intersection $\bigcap_{m=n}^\infty \pi_{I_n \leftarrow I_m}(F_{I_m} \cap \pi_{I_{n-1} \leftarrow I_m}^{-1}(\{x_{n-1}\}))$. By construction, we have
$\pi_{I_n \leftarrow I_m}(x_m) = x_n$ for all $m > n$. If we then select $x \in X_A$ such that $\pi_{I_n \leftarrow A}(x) = x_n$
for all $n$, we conclude that $x$ lies in every $E_n$ as required. □

Now we turn to product Loeb measures. Let us first consider the problem of
multiplying together two Loeb probability spaces $(X, \mathcal{B}_X, \mu_X)$ and $(Y, \mathcal{B}_Y, \mu_Y)$, where
$X = \prod_{n \to \alpha} X_n$, $Y = \prod_{n \to \alpha} Y_n$ are ultraproducts of finite non-empty sets. We have
two probability space structures on the product $X \times Y$, namely the Loeb space
$(X \times Y, \mathcal{B}_{X \times Y}, \mu_{X \times Y})$ and the product space $(X \times Y, \mathcal{B}_X \times \mathcal{B}_Y, \mu_X \times \mu_Y)$. It is easy
to see that the latter space is a restriction of the former, thus $\mathcal{B}_X \times \mathcal{B}_Y \subset \mathcal{B}_{X \times Y}$, and
$\mu_X \times \mu_Y(E) = \mu_{X \times Y}(E)$ whenever $E \in \mathcal{B}_X \times \mathcal{B}_Y$. This is ultimately because the
Cartesian product of two internal sets is again an internal set (identifying Cartesian
products of ultraproducts with ultraproducts of Cartesian products in the obvious
manner). On the other hand, not every set which is measurable in $\mathcal{B}_{X \times Y}$ is
measurable in $\mathcal{B}_X \times \mathcal{B}_Y$; intuitively, the reason for this is that internal subsets of $X \times Y$ need
not be approximable by Boolean combinations of finitely many Cartesian products
of internal subsets of $X$ and $Y$ (or, at the finitary level, not all subsets of $X_n \times Y_n$
can be well approximated by Boolean combinations of finitely many subsets of $X_n$
and $Y_n$, where the number of such subsets is bounded uniformly in $n$). Thus, one
should view the probability space $(X \times Y, \mathcal{B}_{X \times Y}, \mu_{X \times Y})$ as a strict extension of the
probability space $(X \times Y, \mathcal{B}_X \times \mathcal{B}_Y, \mu_X \times \mu_Y)$, or equivalently one should view the
latter space as a strict factor of the former.

Despite the disparity between the two factors, we still have the following version
of the Fubini-Tonelli theorem:

Theorem 19 (Fubini-Tonelli theorem for Loeb measure). Let $X = \prod_{n \to \alpha} X_n$,
$Y = \prod_{n \to \alpha} Y_n$ be ultraproducts of finite non-empty sets. Let $f$ be a bounded
$\mathcal{B}_{X \times Y}$-measurable function. Then, for every $x \in X$, the function $y \mapsto f(x,y)$ is
$\mathcal{B}_Y$-measurable, and the function $x \mapsto \int_Y f(x,y)\, d\mu_Y(y)$ is $\mathcal{B}_X$-measurable.
Furthermore, we have the identity

$$\int_{X \times Y} f(x,y)\, d\mu_{X \times Y}(x,y) = \int_X \left( \int_Y f(x,y)\, d\mu_Y(y) \right) d\mu_X(x).$$

Similarly with the roles of $X$ and $Y$ reversed. As a particular corollary, if $E$ is a
$\mu_{X \times Y}$-null set in $X \times Y$, then for $\mu_X$-almost every $x \in X$, the set $E_x := \{y \in Y :
(x,y) \in E\}$ is a $\mu_Y$-null set, and similarly with the roles of $X$ and $Y$ reversed.

As with the usual Fubini-Tonelli theorem, one can generalize this theorem from
bounded functions to non-negative or absolutely integrable functions (after
excluding some null set of $X$ where the $Y$ integral is infinite or divergent), but we will
not need to do so here.

Proof. This will be a slight variant of the usual proof of the Fubini-Tonelli theorem.
By approximating $f$ by simple functions and using linearity, we may reduce to the
case when $f = 1_E$ is an indicator function for some $E \in \mathcal{B}_{X \times Y}$; thus our task is
now to show that the slices $E_x := \{y \in Y : (x,y) \in E\}$ are $\mathcal{B}_Y$-measurable for every
$x$, the function $x \mapsto \mu_Y(E_x)$ is $\mathcal{B}_X$-measurable, and that

$$\mu_{X \times Y}(E) = \int_X \mu_Y(E_x)\, d\mu_X(x).$$

By the monotone class lemma, it suffices to show that the set of $E$ in $\mathcal{B}_{X \times Y}$
obeying these conclusions is closed under upward unions and downward intersections, and
contains the Boolean algebra of internal subsets of $X \times Y$. The first two claims
follow from several applications of the monotone convergence theorem in the three
probability spaces $(X \times Y, \mathcal{B}_{X \times Y}, \mu_{X \times Y})$, $(X, \mathcal{B}_X, \mu_X)$, and $(Y, \mathcal{B}_Y, \mu_Y)$. So we
may assume that $E$ is an internal subset of $X \times Y$, thus $E = \prod_{n \to \alpha} E_n$ where
$E_n \subset X_n \times Y_n$ for an $\alpha$-large set of $n$. Since

$$E_x = \prod_{n \to \alpha} (E_n)_{x_n}$$

whenever $x = \lim_{n \to \alpha} x_n$, we see that $E_x$ is internal and thus $\mathcal{B}_Y$-measurable for all
$x \in X$. Also, from (2) we have $\mu_Y(E_x) = \operatorname{st}(g(x))$, where $g$ is the internal function

$$g(x) := {}^*\mu_Y(E_x) = \lim_{n \to \alpha} \frac{1}{|Y_n|} |\{y_n \in Y_n : (x_n, y_n) \in E_n\}|.$$

From the trivial Fubini-Tonelli theorem for finite sets, we have

$$\frac{1}{|X_n|} \sum_{x_n \in X_n} \frac{1}{|Y_n|} |\{y_n \in Y_n : (x_n, y_n) \in E_n\}| = \mu_{X_n \times Y_n}(E_n).$$

Taking ultralimits, and approximating $g$ from above and below by simple functions,
we see that

$$\int_X \operatorname{st}(g(x))\, d\mu_X(x) = \mu_{X \times Y}(E)$$

and the claim follows. □
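
The "trivial Fubini-Tonelli theorem for finite sets" used in the proof is a double-counting identity: averaging the normalised sizes of the slices of a set over one factor recovers its normalised size in the product. The following Python sketch checks this on a small example; the helper names `slice_measure` and `check_finite_fubini`, and the particular sets `X`, `Y`, `E`, are illustrative choices and do not come from the paper.

```python
from fractions import Fraction
from itertools import product


def slice_measure(E, x, Y):
    """Normalised counting measure of the slice E_x = {y in Y : (x, y) in E}."""
    return Fraction(sum((x, y) in E for y in Y), len(Y))


def check_finite_fubini(X, Y, E):
    """Return (average over X of the slice measures, normalised size of E)."""
    lhs = sum(slice_measure(E, x, Y) for x in X) / len(X)
    rhs = Fraction(len(E), len(X) * len(Y))
    return lhs, rhs


# Hypothetical finite sets standing in for X_n, Y_n and E_n.
X = range(5)
Y = range(7)
E = {(x, y) for x, y in product(X, Y) if (x * y + x + 2 * y) % 3 == 0}

lhs, rhs = check_finite_fubini(X, Y, E)
print(lhs == rhs)
```

Exact rational arithmetic is used so that the two sides can be compared with `==` rather than up to rounding.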

We can now construct a Loeb product space with infinitely many factors as 
follows. 

Theorem 20 (Loeb product spaces). Let $A$ be an index set (possibly countable or
even uncountable), and let $(X_a)_{a \in A}$ be a family of internal sets $X_a$, with each $X_a$
being the ultraproduct of finite non-empty sets. Let $X_A := \prod_{a \in A} X_a$ be the product
space, and for each finite subset $I$ of $A$, let $\pi_I : X_A \to X_I$ be the projection to the
space $X_I := \prod_{a \in I} X_a$. Let $\mathcal{B}_{X_A}$ denote the $\sigma$-algebra generated by the pullbacks
$\pi_I^*(\mathcal{B}_{X_I}) := \{\pi_I^{-1}(E_I) : E_I \in \mathcal{B}_{X_I}\}$ for all finite subsets $I$ of $A$; equivalently,
$\mathcal{B}_{X_A}$ is generated by the cylinder sets from Lemma 18. Then there exists a unique
probability measure $\mu_{X_A}$ on $\mathcal{B}_{X_A}$ whose pushforward measures $(\pi_I)_* \mu_{X_A}$ agree with
$\mu_{X_I}$ for each finite subset $I$ of $A$, thus

$$\mu_{X_A}(\pi_I^{-1}(E_I)) = \mu_{X_I}(E_I) \tag{3}$$

for all $E_I \in \mathcal{B}_{X_I}$.

Proof. By Lemma 14 and Lemma 18, the cylinder sets on $X_A$ generate a countably
compact $T_1$ $\sigma$-topology on $X_A$. Hence the function $\mu_{X_A}$ defined on the Boolean
algebra of cylinder sets by (3) is a premeasure of total mass 1. The claim then
follows from the Carathéodory extension theorem (or Hahn-Kolmogorov extension
theorem). □

Remark 21. We will refer to the probability space $(X_A, \mathcal{B}_{X_A}, \mu_{X_A})$ generated by
Theorem 20 as the Loeb product space on $X_A$. In general, it will be a strict extension
of the usual product probability space $(X_A, \prod_{a \in A} \mathcal{B}_{X_a}, \prod_{a \in A} \mu_{X_a})$. If we use this Loeb
product space as the sample space for probability theory, then the coordinate
projections from $X_A$ to each factor space $X_a$ can be interpreted as a family $(x_a)_{a \in A}$ of
random variables, with each finite subtuple $(x_a)_{a \in I}$ being distributed with the law
of the Loeb probability space $(X_I, \mathcal{B}_{X_I}, \mu_{X_I})$. In particular, for any $\mathcal{B}_{X_I}$-measurable
set $E_I$, we have

$$\mathbf{P}((x_a)_{a \in I} \in E_I) = \mu_{X_I}(E_I).$$

This property is stronger than joint independence of the $x_a$, because the $\sigma$-algebra
$\mathcal{B}_{X_I}$ is significantly larger than the product $\sigma$-algebra $\prod_{a \in I} \mathcal{B}_{X_a}$.

3. Proof of Theorem 9

We now prove Theorem 9, whose proof is simpler than that of Theorem 5, but
already illustrates the key strategies used in that latter proof, in particular the
use of an ultraproduct correspondence principle to reduce the problem to an
ergodic theoretic one. More precisely, we will deduce Theorem 9 from the following
recurrence result:

Proposition 22 (Double recurrence). Let $G$ be an infinite (and possibly
uncountable) group that acts on a probability space $(X, \mu)$ by two commuting
measure-preserving actions $(L_g)_{g \in G}$, $(R_g)_{g \in G}$ (thus $\mu(L_g E) = \mu(R_g E) = \mu(E)$, $L_g L_h =
L_{gh}$, $R_g R_h = R_{gh}$, and $L_g R_h = R_h L_g$ for all $g, h \in G$ and measurable $E \subset X$),
and let $A$ be a subset of $X$ of positive measure. Then there exists a non-trivial
group element $g \in G$ such that

$$\mu(A \cap L_g A \cap L_g R_g A) > 0.$$

In particular, $A \cap L_g A \cap L_g R_g A$ is non-empty.

Proof. Since every infinite group contains a countably infinite subgroup, we may
assume without loss of generality that $G$ is countable. The claim now follows from
[10, Theorem 1.5] (note carefully that this result does not require $G$ to be amenable).
In fact, that theorem yields the stronger result that there exists $\lambda > 0$ for which
the set $\{g : \mu(A \cap L_g A \cap L_g R_g A) > \lambda\}$ is both left-syndetic and right-syndetic (and
is even central* and inverse central*; see [10, p. 1256] for definitions). □

Now we can begin the proof of Theorem 9. In order to emphasise the relationship
with the ergodic theorem in Proposition 22, we introduce the uniform probability
measure $\mu_G$ on a finite group $G$, thus

$$\mu_G(E) := |E|/|G|$$

for all $E \subset G$, and $L^2(G) = L^2(G, \mu_G)$. We also introduce the left-shift $L_g$ and
right-shift $R_g$ actions on $G$ by the formulae

$$L_g x := gx; \quad R_g x := x g^{-1}; \tag{4}$$

these are commuting actions of $G$ on itself. They induce the associated Koopman
operators on $L^2(G)$ by the formulae

$$L_g f(x) := f(g^{-1} x)$$

and

$$R_g f(x) := f(xg),$$

which are of course unitary operators. Observe that for any $g \in G$, we have

$$\mu_G(\{x \in G : x, xg, gx \in A\}) = \mu_G(A \cap L_g A \cap L_g R_g A)$$

and so our task is to show that for any $\delta > 0$ there exist $N, \varepsilon > 0$ such that if $G$
is a finite group with $|G| \ge N$ and $A$ is a subset of $G$ with $\mu_G(A) \ge \delta$, then there
exists a non-trivial $g \in G$ such that

$$\mu_G(A \cap L_g A \cap L_g R_g A) \ge \varepsilon.$$
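
The counting identity relating the pattern $x, xg, gx$ to the triple intersection can be checked by brute force on a small group. The sketch below does this for the symmetric group $S_3$, with permutations stored as tuples; the helper names (`mul`, `inv`, `left_shift`, `right_shift`, `pattern_count`, `intersection_count`) and the choice of $S_3$ are illustrative assumptions, not part of the paper. The two sets being counted are generally different; they have the same size via the change of variables $x \mapsto g^{-1}x$.

```python
from itertools import permutations


def mul(p, q):
    """Compose permutations given as tuples: (p*q)(i) = p[q[i]]."""
    return tuple(p[i] for i in q)


def inv(p):
    """Inverse permutation."""
    out = [0] * len(p)
    for i, j in enumerate(p):
        out[j] = i
    return tuple(out)


G = list(permutations(range(3)))  # the symmetric group S_3


def left_shift(g, A):
    """L_g A = {g a : a in A}, following L_g x := g x."""
    return {mul(g, a) for a in A}


def right_shift(g, A):
    """R_g A = {a g^{-1} : a in A}, following R_g x := x g^{-1}."""
    g_inv = inv(g)
    return {mul(a, g_inv) for a in A}


def pattern_count(g, A):
    """Number of x in G with x, xg and gx all in A."""
    return sum(1 for x in A if mul(x, g) in A and mul(g, x) in A)


def intersection_count(g, A):
    """Size of the intersection of A, L_g A and L_g R_g A."""
    return len(A & left_shift(g, A) & left_shift(g, right_shift(g, A)))


A = set(G[:4])  # an arbitrary subset of S_3
print(all(pattern_count(g, A) == intersection_count(g, A) for g in G))
```

Dividing either count by $|G|$ gives the corresponding value of $\mu_G$.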



Suppose for sake of contradiction that the claim failed. Carefully negating the
quantifiers, and using the axiom of choice, we may then find a $\delta > 0$, a sequence
$G_n$ of finite groups for each $n \in \mathbf{N} := \{1, 2, 3, \ldots\}$ with $|G_n| \ge n$, and subsets $A_n$
of $G_n$, with the properties that

$$\mu_{G_n}(A_n) \ge \delta \tag{5}$$

and

$$\mu_{G_n}(A_n \cap L_{n,g_n} A_n \cap L_{n,g_n} R_{n,g_n} A_n) < \frac{1}{n} \tag{6}$$

for all non-trivial $g_n \in G_n$, where we use $L_{n,g_n}$, $R_{n,g_n}$ to denote the left and right
actions for $G_n$.

Fix all the above data $G_n, A_n, \delta$, which we can view as a sequence of finitary
"approximate counterexamples" to Theorem 9. The next step is to use an
ultraproduct construction to pass from this sequence of approximate counterexamples to
a genuine counterexample to Theorem 9 and obtain the desired contradiction.
Versions of this "compactness and contradiction" strategy of course appear in many
arguments, including some versions of the Furstenberg correspondence principle
(see e.g. [2], [!]); see also [30] for a construction closely related to the one used 
here. One could also formalize the arguments here in the language of nonstandard 
analysis, but we will avoid doing so in this paper in order to reduce the possibility 
of confusion. 

As in the previous section, we fix a non-principal ultrafilter $\alpha \in \beta\mathbf{N} \setminus \mathbf{N}$ on
the natural numbers. We may now form the ultraproducts $G := \prod_{n \to \alpha} G_n$ and
$A := \prod_{n \to \alpha} A_n$. As the $G_n$ were all groups, the ultraproduct $G$ is also a group
(with the group and inversion operations being the ultralimits of the associated
operations on $G_n$). On the other hand, as each $G_n$ had cardinality at least $n$, we
easily verify that $G$ has cardinality at least $n$ for each $n$, so that $G$ is now an infinite
group (indeed, it will necessarily be uncountable, since it is countably compact thanks
to Lemma 14). The set $A$ is of course a subset of $G$. Our objective is to use this
data to build a counterexample to Theorem [S] 

Let $(X, \mathcal{X}, \mu) := (G, \mathcal{B}_G, \mu_G)$ be the Loeb probability space associated to $G$.
(We will use two different symbols $G$, $X$ to describe the same object here, in order
to emphasise the conceptual distinction between the underlying space $X$, and the
group $G$ that acts on that space.) It is easy to see that the left and right actions
$(L_g)_{g \in G}$, $(R_g)_{g \in G}$ are measure-preserving actions on $X$, thus for any $g \in G$, the
maps $E \mapsto L_g E$ and $E \mapsto R_g E$ are measure-preserving on $(X, \mathcal{X}, \mu)$. Note however
that $(g, x) \mapsto L_g x$ and $(g, x) \mapsto R_g x$ are jointly measurable as maps from $G \times X$ to
$X$ only if one uses the Loeb product $\sigma$-algebra $\mathcal{B}_{G \times X}$ on $G \times X$, rather than the
product Loeb $\sigma$-algebra $\mathcal{B}_G \times \mathcal{B}_X$. To avoid technical issues associated to this, we
will not perform any operation (e.g. integration in $G$ rather than in $X$) that would
require joint measurability of the actions.

Remark 23. The Loeb probability measure $\mu_G$ on the group $G$ is closely
analogous to a Haar probability measure on a compact group, with the main difference
being that $G$ is only a (countably) compact group with respect to a $\sigma$-topological
structure, rather than a genuinely topological structure. (For instance, it is easy to
see that the group operations $g \mapsto g^{-1}$, $(g, h) \mapsto gh$ are continuous with respect to
the $\sigma$-topological structures on $G$ and $G \times G$.)

The set $A$ is an internal subset of $X$ and is hence measurable in $(X, \mathcal{X}, \mu)$. From
(5), (2) we see that

$$\mu(A) \ge \delta;$$

in particular, $A$ has positive measure. Applying Proposition 22, we can thus find a
non-trivial element $g$ of $G$ such that

$$\mu(A \cap L_g A \cap L_g R_g A) > \varepsilon$$

for some $\varepsilon > 0$. Now, if we write $g = \lim_{n \to \alpha} g_n$, then $g_n$ is non-trivial for an
$\alpha$-large set of $n$. Furthermore, from (2) we have

$$\mu(A \cap L_g A \cap L_g R_g A) = \operatorname{st} \lim_{n \to \alpha} \mu_{G_n}(A_n \cap L_{n,g_n} A_n \cap L_{n,g_n} R_{n,g_n} A_n)$$

and thus

$$\mu_{G_n}(A_n \cap L_{n,g_n} A_n \cap L_{n,g_n} R_{n,g_n} A_n) > \varepsilon$$

for an $\alpha$-large set of $n$. But this contradicts (6) (note that as $\alpha$ is non-principal,
any $\alpha$-large subset of $\mathbf{N}$ contains arbitrarily large elements). This contradiction
concludes the proof of Theorem 9. □

Remark 24. It is also possible to replace the use of ultraproducts in the above
argument with applications of the compactness and completeness theorems in
first-order logic instead, to obtain a countably saturated model $(G, A, \mu, L, R)$ of a
group $G$ and set $A$ that formally lies in a space $X$ with a probability measure $\mu$,
and commuting actions $(L_g)_{g \in G}$, $(R_g)_{g \in G}$, such that this model is a limit of the
finitary models $(G_n, A_n, \mu_{G_n}, L_n, R_n)$ in the sense that any statement in first-order
logic which holds for all but finitely many of the finitary models, is also true in the
countably saturated model. We leave the details to the interested reader.

4. An ergodic theorem 

We now begin the proof of Theorem 5, which follows a similar strategy to that
of Theorem 9, but with some additional complications. In particular, we will need
to replace the multiple recurrence theorem in Proposition 22 with a convergence
theorem which, due to the inherent independence properties of quasirandom groups
in our application, takes the shape of a relative weak mixing result along a properly
chosen IP system. To prove this theorem, it will be convenient to use the machinery
of idempotent ultrafilters and their associated limits.

We turn to the details. Given a group $G$, define an IP system in $G$ to be a set
of the form

$$H = \{g_{i_1} \cdots g_{i_r} : r \ge 1;\ 1 \le i_1 < i_2 < \cdots < i_r\}$$

where $g_1, g_2, \ldots$ are an infinite sequence of elements in $G$ (not necessarily distinct).
Inside such an IP system, we can form the sub-IP system

$$H_n = \{g_{i_1} \cdots g_{i_r} : r \ge 1;\ n \le i_1 < i_2 < \cdots < i_r\}$$

for any natural number $n$. Given a sequence $(x_g)_{g \in H}$ of points in a Hausdorff
topological space $Z$ indexed by $H$ and a point $x \in Z$, we say that $x_g$ converges
along $H$ to $x$, and write $H\text{-}\lim_g x_g = x$, if for every neighborhood $V$ of $x$, there
exists $n$ such that $x_g \in V$ for all $g \in H_n$.

(Footnote to Remark 24.) A model is countably saturated if, whenever one has a
countable family of sentences $S_1, S_2, \ldots$ with the property that any finite number of
these sentences are simultaneously satisfiable, then the entire family is simultaneously
satisfiable; this property is the model-theoretic analogue of Lemma 14.

(Footnote to the definition of an IP system.) Strictly speaking, the IP system should
be a tuple consisting of the set $H$ together with the generators $g_1, g_2, \ldots$, but we shall
abuse notation and refer to the system simply by the set $H$.
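
To make the definition concrete, the following Python sketch enumerates the finite part of an IP system generated by the first $k$ generators, that is, all ordered products $g_{i_1} \cdots g_{i_r}$ with $\texttt{start} \le i_1 < \cdots < i_r \le k$. The function name `finite_ip_products`, its `start` parameter (playing the role of $n$ in $H_n$) and the sample generators in $S_3$ are hypothetical choices made for illustration only.

```python
from functools import reduce
from itertools import combinations


def finite_ip_products(gens, mul, start=1):
    """Map each index tuple (i_1 < ... < i_r), with start <= i_1 and
    i_r <= len(gens), to the ordered product g_{i_1} ... g_{i_r}.
    Indices are 1-based, as in the text."""
    indices = range(start, len(gens) + 1)
    products = {}
    for r in range(1, len(indices) + 1):
        for idx in combinations(indices, r):
            products[idx] = reduce(mul, (gens[i - 1] for i in idx))
    return products


def perm_mul(p, q):
    """Compose permutations given as tuples: (p*q)(i) = p[q[i]]."""
    return tuple(p[i] for i in q)


# Three sample generators in S_3 (repetitions among generators are allowed).
gens = [(1, 0, 2), (0, 2, 1), (1, 2, 0)]
H_finite = finite_ip_products(gens, perm_mul)           # indices from 1
H2_finite = finite_ip_products(gens, perm_mul, start=2)  # sub-system, indices from 2
print(len(H_finite), len(H2_finite))
```

With $k$ generators there are $2^k - 1$ index tuples, although distinct tuples may give the same group element, so the set of products itself can be smaller.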

The variant of Proposition 22 that we will need is

Theorem 25. Let $(X, \mathcal{X}, \mu)$ be a probability space, and let $G$ be a group. Let
$(L_g)_{g \in G}$ and $(R_g)_{g \in G}$ be two measure-preserving actions of $G$ on $X$, which commute
in the sense that $L_g R_h = R_h L_g$ for all $g, h \in G$. Define $L_g f := f \circ L_g^{-1}$ and
$R_g f := f \circ R_g^{-1}$ for $f \in L^\infty(X, \mu)$. Let $H$ be an IP system in $G$, and $f_1, f_2, f_3$ be
elements of $L^\infty(X, \mu)$. Assume the following mixing properties:

(i) (Left mixing) For any $f, f' \in L^2(X, \mathcal{X}, \mu)$, one has

$$H\text{-}\lim_g \int_X f L_g f'\, d\mu = \left(\int_X f\, d\mu\right)\left(\int_X f'\, d\mu\right).$$

(In particular, we assume that this $H$-limit exists.)

(ii) (Right mixing) For any $f, f' \in L^2(X, \mathcal{X}, \mu)$, one has

$$H\text{-}\lim_g \int_X f R_g f'\, d\mu = \left(\int_X f\, d\mu\right)\left(\int_X f'\, d\mu\right).$$

(iii) ($f_3$ orthogonal to diagonally rigid functions) Let $f \in L^2(X, \mathcal{X}, \mu)$ be any
function with the rigidity property that, for any $\varepsilon > 0$ and for any natural
number $n$, there exists $g \in H_n$ such that $\|L_g R_g f - f\|_{L^2(X, \mathcal{X}, \mu)} \le \varepsilon$. Then
$\int_X f f_3\, d\mu = 0$.

Then for any $\varepsilon > 0$ and natural number $n$, there exists $g \in H_n$ such that

$$\left| \int_X f_1 (L_g f_2)(L_g R_g f_3)\, d\mu \right| \le \varepsilon.$$

Note that we allow $G$ to be uncountable. However, observe that we may without
loss of generality restrict $G$ to the group generated by $H$, so we may assume without
loss of generality that $G$ is at most countable. Note also that the left and right
mixing properties are assumed to apply to all $L^2$ functions $f, f'$, not just the three
given functions $f_1, f_2, f_3$, but the diagonal mixing property (or more precisely, the
orthogonality to diagonally rigid function property) is only imposed for the specific
function $f_3$. In our application, we cannot impose diagonal mixing for arbitrary
functions, because of the non-ergodicity of the conjugation action (any subset of a
group $G$ which is a union of conjugacy classes is clearly invariant with respect to
conjugation).

To prove Theorem 25, we will use the tool of idempotent ultrafilters. We stress
that despite some superficial similarities, these ultrafilters are unrelated to the
non-principal ultrafilter $\alpha$ used in the ultraproduct correspondence principle, and are
used for completely different purposes.

We first recall the definition of an idempotent ultrafilter. See O El [6] for some 
surveys on idempotent ultrafilters and their uses in ergodic theory. 

Definition 26 (Idempotent ultrafilter). Let $G$ be an at most countable group.
Given an ultrafilter $p \in \beta G$, define the product ultrafilter $p \cdot p$ by requiring that
$A \in p \cdot p$ if and only if $A g^{-1}$ is $p$-large for a $p$-large set of $g$. (Recall that a subset
of $G$ is $p$-large if it lies in $p$.) We say that the ultrafilter $p$ is idempotent if $p \cdot p = p$.

(Footnote to the use of idempotent ultrafilters.) It should also be possible to prove
Theorem 25 without idempotent ultrafilters, by replacing the notion of a $p$-limit with
that of an IP-limit. But then one would need to repeatedly invoke Hindman's theorem
[22] as a substitute for the idempotent property, which would require one to
continually pass to IP subsystems of the original IP system. We will not detail this
approach to Theorem 25 here.

We have the following basic existence theorem: 

Theorem 27 (IP systems support idempotent ultrafilters). Let $H$ be an IP system
in an at most countable group $G$. Then there exists an idempotent ultrafilter $p$ on
$G$ supported by every $H_n$ (i.e. $H_n$ is $p$-large for all $n$).

Proof. See [6, Theorem 2.5] (the proof there is stated for actions of $\mathbf{N}$, but the
argument is general and applies to arbitrary groups or semigroups $G$). In [33], this
result is attributed to Galvin. □

We will need the notions of $p$-limit and IP-limit. If $p$ is an ultrafilter on $G$,
$(x_g)_{g \in H}$ is a tuple in a Hausdorff topological space $Z$ indexed by a $p$-large set $H$,
and $x$ is a point in $Z$, we say that $x_g$ converges along $p$ to $x$, and write $p\text{-}\lim_g x_g = x$,
if for every neighborhood $V$ of $x$, the set $\{g \in H : x_g \in V\}$ lies in $p$.

Note that if $H$ is an IP system, and $p$ is an ultrafilter that is supported by $H_n$ for
every $n$, then convergence along $H$ implies convergence along $p$: if $(x_g)_{g \in H}$ is a
tuple in a Hausdorff topological space $Z$ indexed by $H$ and $x \in Z$, then $H\text{-}\lim_g x_g = x$
implies $p\text{-}\lim_g x_g = x$. Conversely, if $p\text{-}\lim_g x_g = x$, then for every $n$ and every
neighborhood $V$ of $x$ there exists $g \in H_n$ such that $x_g \in V$.

Given a unitary action $(U_g)_{g \in G}$ of an at most countable group $G$ on a Hilbert
space $W$, and an idempotent ultrafilter $p \in \beta G$, we say that an element $f$ of $W$ is
rigid with respect to the $(U_g)_{g \in G}$ action along $p$ if one has $p\text{-}\lim_g U_g f = f$ in the
weak topology of $W$. We will need the following ergodic theorem for idempotent
ultrafilters:

Theorem 28 (Idempotent ergodic theorem). Let $(U_g)_{g \in G}$ be a unitary action of
an at most countable group $G$ on a Hilbert space $W$, and let $f \in W$, and let $p$ be an
idempotent ultrafilter on $G$. Then $p\text{-}\lim_g U_g f$ exists in the weak topology of $W$ and
is equal to $Pf$, where $P$ is the orthogonal projection to the closed linear subspace
$\{f' : p\text{-}\lim_g U_g f' = f'\}$ of $W$ consisting of functions that are rigid with respect to
the $(U_g)_{g \in G}$ action along $p$.

Proof. See [10, Theorem 2.4]. We remark that a related theorem (under the
additional hypothesis that the idempotent ultrafilter $p$ is minimal) was established
in [3, Corollary 4.6]. In the minimal idempotent case we also have the additional
property that P commutes with the U g ; see p~Ql Theorem 2.4]. However, we will 
not need this additional fact in our arguments here. □ 

Finally, we will need the following version of the van der Corput lemma for 
idempotent ultrafilters. 

Theorem 29 (Idempotent van der Corput lemma). Let $(U_g)_{g \in G}$ be a unitary action
of an at most countable group $G$ on a Hilbert space $W$, and let $p$ be an idempotent
ultrafilter on $G$. If $(f_g)_{g \in G}$ is a bounded family of vectors in $W$ with the property
that

$$p\text{-}\lim_h\, p\text{-}\lim_g \langle f_{gh}, f_g \rangle_W = 0,$$

then

$$p\text{-}\lim_g f_g = 0$$

in the weak topology.

Proof. See [23, Theorem 2.3]. □

We have enough machinery to prove Theorem 25.

Proof of Theorem 25. As discussed previously, we may assume without loss
of generality that $G$ is at most countable. By Theorem 27 we may find an
idempotent ultrafilter $p \in \beta G$ which is supported by $H_n$ for every $n$. From hypothesis
(i) we see that for all $f' \in L^2(X, \mathcal{X}, \mu)$, we have

$$H\text{-}\lim_g L_g f' = \int_X f'\, d\mu$$

in the weak topology of $L^2(X, \mathcal{X}, \mu)$, and hence

$$p\text{-}\lim_g L_g f' = \int_X f'\, d\mu \tag{7}$$

in the weak topology of $L^2(X, \mathcal{X}, \mu)$ also. Similarly, from (ii) we have

$$p\text{-}\lim_g R_g f' = \int_X f'\, d\mu \tag{8}$$

in the weak topology of $L^2(X, \mathcal{X}, \mu)$ for all $f' \in L^2(X, \mathcal{X}, \mu)$. Next, suppose that
$f \in L^2(X, \mathcal{X}, \mu)$ is rigid with respect to the $(L_g R_g)_{g \in G}$ action along $p$, thus

$$p\text{-}\lim_g L_g R_g f = f$$

in the weak topology, and in particular

$$p\text{-}\lim_g \langle f, L_g R_g f \rangle_{L^2(X, \mathcal{X}, \mu)} = \|f\|_{L^2(X, \mathcal{X}, \mu)}^2;$$

on the other hand, by the parallelogram law and the unitary nature of $L_g R_g$ we
have

$$\|f - L_g R_g f\|_{L^2(X, \mathcal{X}, \mu)}^2 = 2 \|f\|_{L^2(X, \mathcal{X}, \mu)}^2 - 2 \operatorname{Re} \langle f, L_g R_g f \rangle_{L^2(X, \mathcal{X}, \mu)}$$

and thus

$$p\text{-}\lim_g \|f - L_g R_g f\|_{L^2(X, \mathcal{X}, \mu)} = 0$$

and so we have

$$p\text{-}\lim_g L_g R_g f = f$$

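in the strong topology also. In particular, since $p$ is supported by $H_n$ for every $n$,
for any $\varepsilon > 0$ and any natural number $n$ there exists $g \in H_n$ such that

$$\|L_g R_g f - f\|_{L^2(X, \mathcal{X}, \mu)} \le \varepsilon.$$
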
By hypothesis (iii), this forces $f$ to be orthogonal to $f_3$, thus we have

$$\int_X f f_3\, d\mu = 0 \tag{9}$$

whenever $f \in L^2(X, \mathcal{X}, \mu)$ is rigid with respect to the $(L_g R_g)_{g \in G}$ action along $p$.
To establish the theorem, it will suffice to show that

$$p\text{-}\lim_g \int_X f_1 (L_g f_2)(L_g R_g f_3)\, d\mu = 0,$$

or equivalently that

$$p\text{-}\lim_g (L_g f_2)(L_g R_g f_3) = 0 \tag{10}$$

in the weak topology.

Let us first consider the case when $f_2 = 1$, that is we will show

$$p\text{-}\lim_g \int_X f_1 (L_g R_g f_3)\, d\mu = 0. \tag{11}$$

By Theorem 28, the left-hand side of (11) is equal to $\int_X f_1 P f_3\, d\mu$, where $P$ is the
orthogonal projection onto the functions that are rigid with respect to the $(L_g R_g)_{g \in G}$
action along $p$; but from (9) we have $P f_3 = 0$. This establishes (11).

By linearity, we may now reduce to the task of establishing (10) when $f_2$ has
mean zero:

$$\int_X f_2\, d\mu = 0. \tag{12}$$

To handle this case, we apply Theorem 29 with $W := L^2(X, \mathcal{X}, \mu)$ and $f_g :=
(L_g f_2)(L_g R_g f_3)$; we see that to show (10), it will suffice to show that

$$p\text{-}\lim_h\, p\text{-}\lim_g \int_X (L_{gh} f_2)(L_{gh} R_{gh} f_3)(L_g f_2)(L_g R_g f_3)\, d\mu = 0. \tag{13}$$

We may rearrange the left-hand side (using the commutativity of the $L$ and $R$
actions) as

$$p\text{-}\lim_h\, p\text{-}\lim_g \int_X (f_2 L_h f_2) R_g (f_3 L_h R_h f_3)\, d\mu.$$

Applying (8), we may simplify this as

$$p\text{-}\lim_h \left( \int_X f_2 L_h f_2\, d\mu \right) \left( \int_X f_3 L_h R_h f_3\, d\mu \right).$$

From (7), (12) we have

$$p\text{-}\lim_h \int_X f_2 L_h f_2\, d\mu = \left( \int_X f_2\, d\mu \right) \left( \int_X f_2\, d\mu \right) = 0.$$

Since $\int_X f_3 L_h R_h f_3\, d\mu$ is bounded in $h$, the claim (13) follows. □

Remark 30. An inspection of the above argument reveals that we have actually
proven an idempotent ultrafilter version of Theorem 25, in which the IP system
$H$ is replaced by an idempotent ultrafilter $p$, the notion of an $H$-limit is replaced
by a $p$-limit, the rigidity property in Theorem 25(iii) is replaced by the hypothesis
that $p\text{-}\lim_g L_g R_g f = f$ (in the strong $L^2$ topology), and the conclusion is that
$p\text{-}\lim_g \int_X f_1 (L_g f_2)(L_g R_g f_3)\, d\mu = 0$.

5. Ultra quasirandom groups 

Throughout this section, we fix a non-principal ultrafilter $\alpha \in \beta\mathbf{N} \setminus \mathbf{N}$.

In order to prove Theorem 5, we will follow the proof of Theorem 9 and perform
an ultraproduct of a series of proposed counterexamples to Theorem 5. In the
course of doing so, we will be studying ultraproducts of increasingly quasirandom
groups. It will be convenient to give a name to such an object:

Definition 31 (Ultra quasirandom group). An ultra quasirandom group is an
ultraproduct $G = \prod_{n \to \alpha} G_n$ of finite groups with the property that for every $D > 0$,
the groups $G_n$ are $D$-quasirandom for an $\alpha$-large set of $n$. (Informally: the
quasirandomness of the $G_n$ goes to infinity as $n$ approaches $\alpha$.)

To give an example of an ultra quasirandom group, we recall the following
classical result of Frobenius:

Lemma 32 (Frobenius). Let $F_p$ be a finite field of prime order $p$, then the group
$SL_2(F_p)$ of $2 \times 2$ matrices of determinant 1 with entries in $F_p$ is $\frac{p-1}{2}$-quasirandom.

Proof. We may of course take $p$ to be odd. Suppose for contradiction that we
have a non-trivial representation $\rho : SL_2(F_p) \to U_d(\mathbf{C})$ on a unitary group of some
dimension $d$ with $d < \frac{p-1}{2}$. Set $a$ to be the group element

$$a := \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix},$$

and suppose first that $\rho(a)$ is non-trivial. Since $a^p = 1$, we have $\rho(a)^p = 1$; thus all
the eigenvalues of $\rho(a)$ are $p$-th roots of unity. On the other hand, by conjugating
$a$ by diagonal matrices in $SL_2(F_p)$, we see that $a$ is conjugate to $a^m$ (and hence
$\rho(a)$ conjugate to $\rho(a)^m$) whenever $m$ is a quadratic residue mod $p$. As such, the
eigenvalues of $\rho(a)$ must be permuted by the operation $x \mapsto x^m$ for any quadratic
residue $m$ mod $p$. Since $\rho(a)$ has at least one non-trivial eigenvalue, and there are
$\frac{p-1}{2}$ distinct quadratic residues, we conclude that $\rho(a)$ has at least $\frac{p-1}{2}$ distinct
eigenvalues. But $\rho(a)$ is a $d \times d$ matrix with $d < \frac{p-1}{2}$, a contradiction. Thus $a$
lies in the kernel of $\rho$. By conjugation, we then see that this kernel contains all
unipotent matrices. But these matrices are easily seen to generate $SL_2(F_p)$, and
so $\rho$ is trivial, a contradiction. □
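
The conjugation step in the proof can be verified directly for a small prime. The Python sketch below conjugates $a$ by the diagonal matrices $\operatorname{diag}(t, t^{-1})$ over $F_p$ and confirms that the result is $a^m$ with $m = t^2$, so that exactly $(p-1)/2$ distinct powers arise. The helper names (`mat_mul`, `mat_pow`, `conjugates_of_a`) and the choice $p = 7$ are assumptions made for this illustration; the three-argument `pow` with exponent $-1$ requires Python 3.8 or later.

```python
A_UNIPOTENT = ((1, 1), (0, 1))  # the element a from the proof
IDENTITY = ((1, 0), (0, 1))


def mat_mul(A, B, p):
    """Product of 2x2 matrices (nested tuples) with entries reduced mod p."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) % p for j in range(2))
        for i in range(2)
    )


def mat_pow(A, m, p):
    """m-th power of A mod p by repeated multiplication (m >= 0)."""
    result = IDENTITY
    for _ in range(m):
        result = mat_mul(result, A, p)
    return result


def conjugates_of_a(p):
    """Map m = t^2 mod p to D a D^{-1}, where D = diag(t, t^{-1}) in SL_2(F_p)."""
    out = {}
    for t in range(1, p):
        t_inv = pow(t, -1, p)  # modular inverse of t
        D = ((t, 0), (0, t_inv))
        D_inv = ((t_inv, 0), (0, t))
        out[(t * t) % p] = mat_mul(mat_mul(D, A_UNIPOTENT, p), D_inv, p)
    return out


p = 7
conj = conjugates_of_a(p)
print(sorted(conj))  # the quadratic residues mod p
print(all(conj[m] == mat_pow(A_UNIPOTENT, m, p) for m in conj))
```

This only checks the elementary matrix identity behind the argument; it says nothing about representations themselves.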

Thus, if $p_n$ is any sequence of primes going to infinity, and $F$ is the pseudo-finite
field $F := \prod_{n \to \alpha} F_{p_n}$, then $SL_2(F)$ will be an ultra quasirandom group.

Let $G$ be an ultra quasirandom group, then we have the Loeb measure space
$(G, \mathcal{B}_G, \mu_G)$. The mixing property of finite quasirandom groups from Proposition 3
is then reflected in ultra quasirandom groups as follows:

Lemma 33 (Weak mixing). Let $G$ be an ultra quasirandom group, and let $A, B \in
\mathcal{B}_G$ be Loeb measurable subsets of $G$. Then for $\mu_G$-almost every $g \in G$, we have

$$\mu_G(A \cap L_g B) = \mu_G(A) \mu_G(B)$$

and

$$\mu_G(A \cap R_g B) = \mu_G(A) \mu_G(B).$$

In a similar spirit, if $f, f' \in L^2(G, \mathcal{B}_G, \mu_G)$, we have for $\mu_G$-almost every $g \in G$
that

$$\int_G f L_g f'\, d\mu_G = \left( \int_G f\, d\mu_G \right) \left( \int_G f'\, d\mu_G \right)$$

and

$$\int_G f R_g f'\, d\mu_G = \left( \int_G f\, d\mu_G \right) \left( \int_G f'\, d\mu_G \right).$$

Proof. By approximating $L^2$ functions by simple functions, we see that the latter
two conclusions are consequences of the former two. We will just prove the first
claim, as the second claim is similar. From Lemma 17 and a routine limiting
argument, we see that to establish the lemma, it suffices to do so in the case when
$A, B$ are internal sets, thus $A = \prod_{n \to \alpha} A_n$ and $B = \prod_{n \to \alpha} B_n$.

Let $\varepsilon > 0$ and $D > 0$. By hypothesis, $G_n$ is $D$-quasirandom for $n$ sufficiently
close to $\alpha$. By Proposition 3 and Markov's inequality, we have that

$$|\mu_{G_n}(A_n \cap L_{g_n} B_n) - \mu_{G_n}(A_n) \mu_{G_n}(B_n)| \le \varepsilon$$

for all $g_n$ in $G_n \setminus E_n$, where $E_n$ is an exceptional set with $\mu_{G_n}(E_n) \le \varepsilon^{-1} D^{-1/2}$.
If we set $E := \prod_{n \to \alpha} E_n$, then on taking ultralimits we see that $E$ is an internal
subset of $G$ with $\mu_G(E) \le \varepsilon^{-1} D^{-1/2}$, and that

$$|\mu_G(A \cap L_g B) - \mu_G(A) \mu_G(B)| \le \varepsilon$$

for all $g$ outside of $E$. Letting $D$ go to infinity, and then letting $\varepsilon$ go to zero, we
obtain the first conclusion of the lemma as desired. □

In what follows, we would like to introduce a sequence $g_1, g_2, g_3, \ldots$ of elements
drawn uniformly and independently at random from the ultra quasirandom group
$G$. One can of course model this system of random variables rigorously by using
the standard product space

$$(G, \mathcal{B}_G, \mu_G)^{\mathbf{N}} = (G^{\mathbf{N}}, \mathcal{B}_G^{\mathbf{N}}, \mu_G^{\mathbf{N}}).$$

However, the product $\sigma$-algebra $\mathcal{B}_G^{\mathbf{N}}$ will be far too small to measure events involving
products of two or more of the $g_i$, and will therefore be useless for our applications.
We will thus need to invoke Theorem 20 to construct an extension of the standard
product space which can handle finite products of the $g_i$.

We turn to the details. Let $G$ be an ultra quasirandom group. We construct
a sequence $g_1, g_2, g_3, \ldots \in G$ of random variables in $G$, whose joint distribution
$(g_a)_{a \in \mathbf{N}}$ is defined by the coordinate functions on the Loeb product space on $G^{\mathbf{N}} =
\prod_{n \in \mathbf{N}} G$ as discussed in Remark 21. In particular, we have

$$\mathbf{P}((g_a)_{a=1}^k \in E) = \mu_{G^k}(E) \tag{14}$$

for any natural number $k$ and any $\mathcal{B}_{G^k}$-measurable set $E$. We can then form the
random IP system

$$H := \{g_{i_1} \cdots g_{i_r} : r \ge 1;\ 1 \le i_1 < i_2 < \cdots < i_r\}$$

and its sub-IP systems

$$H_n := \{g_{i_1} \cdots g_{i_r} : r \ge 1;\ n \le i_1 < i_2 < \cdots < i_r\}$$

for any natural number $n$.

We now investigate the mixing properties of this random IP system. 

Proposition 34 (Inclusion in a given set). Let $G$, $(g_a)_{a \in \mathbf{N}}$, $H$ be as above. Let $E$
be a (deterministic) Loeb measurable subset of $G$. Then for any $k \in \mathbf{N}$, the event

$$\{ g_{i_1} \cdots g_{i_r} : r \geq 1; \ 1 \leq i_1 < i_2 < \cdots < i_r \leq k \} \subset E \tag{15}$$

occurs with probability exactly $\mu_G(E)^{2^k - 1}$.

Proof. We induct on $k$. The case $k = 0$ is trivial, so suppose $k \geq 1$ and the claim
has already been proven for $k - 1$. We observe that the event (15) is the intersection
of the events

$$g_k \in E \tag{16}$$

and

$$\{ g_{i_1} \cdots g_{i_r} : r \geq 1; \ 1 \leq i_1 < i_2 < \cdots < i_r \leq k - 1 \} \subset E \cap R_{g_k} E. \tag{17}$$

The event (16) occurs with probability $\mu_G(E)$. By Lemma 33 we almost surely
have

$$\mu_G(E \cap R_{g_k} E) = \mu_G(E)^2. \tag{18}$$

If we replace the random variable $g_k$ by a deterministic element of $G$ that obeys (18),
then by the induction hypothesis, the event (17) would then occur with probability
$(\mu_G(E)^2)^{2^{k-1} - 1}$. Applying the Fubini-Tonelli theorem (Theorem 19), we conclude
that (15) occurs with probability $\mu_G(E) \times (\mu_G(E)^2)^{2^{k-1} - 1} = \mu_G(E)^{2^k - 1}$, as desired. □
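The combinatorial core of this induction is that the finite products on $g_1, \ldots, g_k$ split into $g_k$ itself, the products on $g_1, \ldots, g_{k-1}$, and those same products multiplied on the right by $g_k$. The sketch below checks this decomposition of the event (15) into (16) and (17) by brute force, and the exponent recursion $e_k = 1 + 2e_{k-1}$. It is an illustration only: the group $S_4$, the set `E`, all function names, and the convention $R_g E = \{x : xg \in E\}$ are assumptions made here.

```python
from itertools import combinations, permutations
import random

def mul(p, q):
    """Compose permutations written as tuples: (p*q)(i) = p(q(i))."""
    return tuple(p[i] for i in q)

def ip_products(gs):
    """All products g_{i1} ... g_{ir} with r >= 1 and i1 < ... < ir."""
    out = set()
    for r in range(1, len(gs) + 1):
        for idx in combinations(range(len(gs)), r):
            prod = gs[idx[0]]
            for i in idx[1:]:
                prod = mul(prod, gs[i])
            out.add(prod)
    return out

G = list(permutations(range(4)))   # S_4, an illustrative finite group
rng = random.Random(1)
E = set(rng.sample(G, 16))

def event_15(gs):
    """Every finite product of the g_i lies in E."""
    return ip_products(gs) <= E

def event_16_and_17(gs):
    """g_k lies in E, and the products on g_1..g_{k-1} lie in E ∩ R_{g_k} E."""
    *head, gk = gs
    shifted = {x for x in G if mul(x, gk) in E}   # assumed convention for R_{g_k} E
    return gk in E and ip_products(head) <= (E & shifted)

trials = [[rng.choice(G) for _ in range(3)] for _ in range(200)]
events_agree = all(event_15(gs) == event_16_and_17(gs) for gs in trials)

def exponent(k):
    """Exponent of mu(E) after k steps: e_0 = 0, e_k = 1 + 2 * e_{k-1}."""
    return 0 if k == 0 else 1 + 2 * exponent(k - 1)
```

The recursion gives $e_k = 2^k - 1$, matching the probability in the proposition.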

Remark 35. The same argument shows in fact that the random variables $g_{i_1} \cdots g_{i_r}$
for $r \geq 1$ and $1 \leq i_1 < \cdots < i_r$ are jointly independent and uniformly distributed
in $G$, provided that one works with the ordinary product $\sigma$-algebra of all the copies
of $\mathcal{B}_G$, rather than with the Loeb product $\sigma$-algebra constructed in Theorem 20;
we omit the details. (The finitary version of this assertion is already implicit in the
work of Gowers [19].)

We can now demonstrate almost sure IP-mixing of H when applied to sets that 
only depend on finitely many of the generators of H: 

Lemma 36 (Almost sure left and right mixing). Let $G$, $(g_a)_{a \in \mathbf{N}}$, $H$ be as above.
Let $n$ be a natural number, and let $E = E_{g_1,\ldots,g_n}$, $F = F_{g_1,\ldots,g_n}$ be Loeb measurable
subsets of $G$ that depend in some jointly Loeb measurable fashion on $g_1, \ldots, g_n$ (but
do not depend on $g_{n+1}, g_{n+2}, \ldots$), in the sense that the sets

$$\{ (x, g_1, \ldots, g_n) \in G \times G^n : x \in E_{g_1,\ldots,g_n} \}$$

and

$$\{ (x, g_1, \ldots, g_n) \in G \times G^n : x \in F_{g_1,\ldots,g_n} \}$$

are $\mathcal{B}_{G \times G^n}$-measurable. Then almost surely, one has

$$\mu_G(E \cap L_g F) = \mu_G(E)\mu_G(F)$$

and

$$\mu_G(E \cap R_g F) = \mu_G(E)\mu_G(F)$$

for every $g \in H_{n+1}$.

Proof. As $H_{n+1}$ is at most countable, it suffices to verify the claim for $g = g_{i_1} \cdots g_{i_r}$
for a single choice of $r \geq 1$ and $n < i_1 < \cdots < i_r$. By the Fubini-Tonelli theorem
(Theorem 19), it suffices to prove the claim after replacing (or conditioning) the
random variables $g_1, \ldots, g_n \in G$ with deterministic elements of $G$ (note that this
does not affect the joint distribution of $g_{i_1}, \ldots, g_{i_r}$, thanks to Fubini-Tonelli). But
since $r \geq 1$, we see that $g = g_{i_1} \cdots g_{i_r}$ is uniformly distributed in $G$ (in the sense
that it has the law of $\mu_G$ on $\mathcal{B}_G$, thus $\mathbf{P}(g \in A) = \mu_G(A)$ for all $A \in \mathcal{B}_G$); this can
be verified by reducing to the case of internal sets and then verifying the analogous
finitary claim for products $g_n = g_{i_1,n} \cdots g_{i_r,n}$ of uniformly distributed independent
random variables $g_{i_1,n}, \ldots, g_{i_r,n}$ on $G_n$. The claim now follows from Lemma 33. □

We can now almost surely obtain the mixing properties (i), (ii) needed for
Theorem 25.

Given that our sample space is not complete, it may be worth clarifying that we say that a
statement holds almost surely if there is a measurable event in the sample space of probability 1 
for which the statement holds, allowing for the possibility that the statement might also hold on 
some (possibly non-measurable) subset of the complementary null set. 
Corollary 37 (Almost sure left and right mixing, again). Let $G$, $(g_a)_{a \in \mathbf{N}}$, $H$ be as
above. Let $\mathcal{X}_0$ be a separable (i.e. countably generated) sub-$\sigma$-algebra of $\mathcal{B}_G$ that is
deterministic (i.e. it does not depend on any of the $g_a$), and let $\mathcal{X}$ be the random
$\sigma$-algebra generated by $\mathcal{X}_0$ and the shifts $L_g, R_g$ for $g \in H$ (that is to say, $\mathcal{X}$ is
the smallest $\sigma$-algebra containing $\mathcal{X}_0$ which is preserved by the $L_g, R_g$ for $g \in H$).
Then almost surely, one has the mixing property

$$H\text{-}\lim_g \mu_G(E \cap L_g F) = \mu_G(E)\mu_G(F) \tag{19}$$

and

$$H\text{-}\lim_g \mu_G(E \cap R_g F) = \mu_G(E)\mu_G(F) \tag{20}$$

for all $E, F \in \mathcal{X}$. In particular, by the usual limiting argument, almost surely one
has that

$$H\text{-}\lim_g \int_G f \, L_g f' \, d\mu_G = \Big(\int_G f \, d\mu_G\Big)\Big(\int_G f' \, d\mu_G\Big)$$

and

$$H\text{-}\lim_g \int_G f \, R_g f' \, d\mu_G = \Big(\int_G f \, d\mu_G\Big)\Big(\int_G f' \, d\mu_G\Big)$$

for all $f, f' \in L^2(G, \mathcal{X}, \mu_G)$.

Proof. By hypothesis, $\mathcal{X}_0$ can be generated by some countable sequence $E_1, E_2, \ldots$
of deterministic, Loeb measurable subsets of $G$. Then $\mathcal{X}$ is generated by the countable
family of random sets $L_g R_h E_n$, where $g, h \in \langle H \rangle$ lie in the group generated by
$H$, and $n \in \mathbf{N}$. Thus, any measurable set in $\mathcal{X}$ can be approximated to arbitrary
accuracy (with respect to $\mu_G$) by a Boolean combination of finitely many of the
$L_g R_h E_n$. There are only countably many such Boolean combinations. By a limiting
argument, it thus suffices to establish the mixing properties (19), (20) almost
surely for $E$ and $F$ each equal to a fixed such Boolean combination. But the claim
then follows from Lemma 36, since such Boolean combinations depend (in a jointly
Loeb measurable fashion) on only finitely many of the $g_1, g_2, g_3, \ldots$. □

Now we turn to the trickiest property we need for Theorem 25, namely the
diagonal mixing property (iii). Let $\mathcal{I}_G$ be the sub-$\sigma$-algebra of $\mathcal{B}_G$ generated by the
conjugation invariant functions (or sets). We need a technical relationship between
this algebra and the associated algebras $\mathcal{I}_{G_n}$ of the finitary groups $G_n$:

Lemma 38. For each $n$, let $f_n : G_n \to [-1, 1]$ be a function. Then one has

$$\mathbf{E}\big( (\operatorname{st} \lim_{n \to \alpha} f_n) \,\big|\, \mathcal{I}_G \big) = \operatorname{st} \lim_{n \to \alpha} \mathbf{E}(f_n | \mathcal{I}_{G_n})$$

$\mu_G$-almost everywhere.

Proof. The function $\operatorname{st} \lim_{n \to \alpha} \mathbf{E}(f_n | \mathcal{I}_{G_n})$ is clearly invariant under conjugation by
elements of $G$. It thus suffices to show that the function

$$f := \operatorname{st} \lim_{n \to \alpha} f_n - \operatorname{st} \lim_{n \to \alpha} \mathbf{E}(f_n | \mathcal{I}_{G_n})$$

is orthogonal to all $\mathcal{I}_G$-measurable bounded functions.

Let $F : G \to [-1, 1]$ be $\mathcal{I}_G$-measurable. Then by conjugating $x$ by an arbitrary
group element $h \in G$, we have

$$\int_G F(x) f(x) \, d\mu_G(x) = \int_G F(x) f(h x h^{-1}) \, d\mu_G(x).$$

Integrating in the $h$ variable and using the Fubini-Tonelli theorem, we see that

$$\int_G F(x) f(x) \, d\mu_G(x) = \int_G F(x) \Big( \int_G f(h x h^{-1}) \, d\mu_G(h) \Big) \, d\mu_G(x).$$

It will thus suffice to show that

$$\int_G f(h x h^{-1}) \, d\mu_G(h) = 0$$

for any $x \in G$. But we can write $f = \operatorname{st} \lim_{n \to \alpha} \tilde f_n$, where

$$\tilde f_n := f_n - \mathbf{E}(f_n | \mathcal{I}_{G_n}),$$

and direct calculation shows that

$$\int_{G_n} \tilde f_n(h_n x_n h_n^{-1}) \, d\mu_{G_n}(h_n) = 0$$

for any $x_n \in G_n$, and the claim then follows from taking ultralimits (and approximating
$f_n$ above and below by simple functions). □
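The "direct calculation" in the finitary setting can be made concrete. In a finite group, the conditional expectation onto conjugation invariant functions is the average over the conjugacy class, which coincides with the average of $f(hxh^{-1})$ over $h$ because all fibres of $h \mapsto hxh^{-1}$ have the same size. The sketch below checks this with exact rational arithmetic; the group $S_4$, the random test function, and all names are illustrative assumptions rather than anything from the paper.

```python
from fractions import Fraction
from itertools import permutations
import random

def mul(p, q):
    """Compose permutations written as tuples: (p*q)(i) = p(q(i))."""
    return tuple(p[i] for i in q)

def inv(p):
    """Inverse permutation."""
    r = [0] * len(p)
    for i, v in enumerate(p):
        r[v] = i
    return tuple(r)

G = list(permutations(range(4)))   # S_4, an illustrative finite group
rng = random.Random(2)
f = {x: Fraction(rng.randint(-5, 5), 5) for x in G}   # values in [-1, 1]

def conj_average(func, x):
    """Average of func(h x h^{-1}) over all h in G."""
    return sum(func[mul(mul(h, x), inv(h))] for h in G) / len(G)

def class_average(func, x):
    """Average of func over the conjugacy class of x, i.e. E(func | I_G)(x)."""
    cls = {mul(mul(h, x), inv(h)) for h in G}
    return sum(func[y] for y in cls) / len(cls)

cond_exp = {x: class_average(f, x) for x in G}
residual = {x: f[x] - cond_exp[x] for x in G}   # finite analogue of f_n - E(f_n | I_{G_n})

averages_match = all(conj_average(f, x) == cond_exp[x] for x in G)
residual_averages_vanish = all(conj_average(residual, x) == 0 for x in G)
```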

Lemma 39 (Almost sure relative diagonal mixing). Let $G$, $(g_a)_{a \in \mathbf{N}}$, $H$ be as above.
Let $n$ be a natural number, and let $f = f_{g_1,\ldots,g_n} : G \to [-1, 1]$ be a Loeb measurable
function that depends in a jointly Loeb measurable fashion on the random variables
$g_1, \ldots, g_n$ (but does not depend on $g_{n+1}, g_{n+2}, \ldots$), in the sense that the function
$(x, g_1, \ldots, g_n) \mapsto f_{g_1,\ldots,g_n}(x)$ is a measurable function from $G \times G^n$ to $[-1, 1]$. Then
almost surely, one has the identity

$$\| f - L_g R_g f \|_{L^2(G, \mathcal{B}_G, \mu_G)}^2 = 2 \| f - \mathbf{E}(f | \mathcal{I}_G) \|_{L^2(G, \mathcal{B}_G, \mu_G)}^2 \tag{21}$$

for all $g \in H_{n+1}$.

Proof. As $H_{n+1}$ is countable, it suffices to establish this claim for a single
$g = g_{i_1} \cdots g_{i_r}$. By approximation, we may also assume that $f$ is a simple function,
that is to say a finite linear combination of indicator functions of internal subsets
of $G$, and in particular is an internal function $f = \lim_{n \to \alpha} f_n$. By the Fubini-Tonelli
theorem, we may replace the random variables $g_1, \ldots, g_n$ by deterministic
elements of $G$, thus making $f$ deterministic, without affecting the joint distribution
of $g_{n+1}, g_{n+2}, \ldots$.

We now compute the random variable

$$\delta := \| f - L_g R_g f \|_{L^2(G, \mathcal{B}_G, \mu_G)}.$$

We can expand

$$\delta^2 = \int_G | f(x) - f(g^{-1} x g) |^2 \, d\mu_G(x).$$

Conjugating $x$ by an arbitrary group element $h \in G$, then averaging in $h$, we
conclude (either by the Fubini-Tonelli theorem, or by taking ultralimits from the
finitary case) that

$$\delta^2 = \int_{G^2} | f(h^{-1} x h) - f(g^{-1} h^{-1} x h g) |^2 \, d\mu_{G \times G}(x, h).$$

If we then let $f_x : G \to \mathbf{R}$ be the function

$$f_x(h) := f(h^{-1} x h)$$

then we conclude (again by Fubini-Tonelli or from consideration of the finitary case)
that

$$\delta^2 = \int_G \Big( \int_G | f_x(h) - f_x(hg) |^2 \, d\mu_G(h) \Big) \, d\mu_G(x).$$

Expanding out the square, we obtain

$$\delta^2 = 2 \int_G \Big( \| f_x \|_{L^2(G)}^2 - \int_G f_x(h) f_x(hg) \, d\mu_G(h) \Big) \, d\mu_G(x).$$

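As in the proof of Lemma 36, $g$ is uniformly distributed on $G$. By Lemma 33 we thus conclude
that for each $x \in G$, we almost surely have

$$\int_G f_x(h) f_x(hg) \, d\mu_G(h) = \Big( \int_G f_x \, d\mu_G \Big)^2. \tag{22}$$

By the Fubini-Tonelli theorem, we conclude that almost surely one has (22) for
$\mu_G$-almost every $x$. In particular, we almost surely have

$$\delta^2 = 2 \int_G \Big( \| f_x \|_{L^2(G)}^2 - \Big( \int_G f_x \, d\mu_G \Big)^2 \Big) \, d\mu_G(x).$$

We can of course write

$$\| f_x \|_{L^2(G)}^2 - \Big| \int_G f_x \, d\mu_G \Big|^2 = \Big\| f_x - \int_G f_x \, d\mu_G \Big\|_{L^2(G)}^2.$$

At this point it is convenient to pass back to the finitary setting. We can rewrite
the previous formula for $\delta$ as

$$\delta^2 = \operatorname{st} \lim_{n \to \alpha} 2 \, \mathbf{E}_{x \in G_n} \| f_{x,n} - \mathbf{E}_{G_n} f_{x,n} \|_{L^2(G_n)}^2$$

where $f_{x,n}(h) := f_n(h^{-1} x h)$. As all the fibers of the map $h \mapsto h^{-1} x h$ have the
same cardinality, we have

$$\mathbf{E}_{G_n} f_{x,n} = \mathbf{E}(f_n | \mathcal{I}_{G_n})(x).$$

Also we observe the identity

$$\mathbf{E}_{x \in G_n} \| f_{x,n} - \mathbf{E}(f_n | \mathcal{I}_{G_n})(x) \|_{L^2(G_n)}^2 = \| f_n - \mathbf{E}(f_n | \mathcal{I}_{G_n}) \|_{L^2(G_n)}^2.$$

In summary, we conclude almost surely that

$$\delta^2 = 2 \operatorname{st} \lim_{n \to \alpha} \| f_n - \mathbf{E}(f_n | \mathcal{I}_{G_n}) \|_{L^2(G_n)}^2,$$

and the claim now follows from Lemma 38. □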
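The finitary computation at the end of this proof has an exact counterpart in any finite group: averaging $\|f - L_g R_g f\|^2$ over all $g$ gives $2\|f - \mathbf{E}(f|\mathcal{I}_G)\|^2$. The lemma asserts much more (the identity for almost every individual $g$ in the ultraproduct setting), so the sketch below illustrates only the averaged version. The group $S_4$, the random test function, the helper names, and the convention $L_g R_g f(x) = f(g^{-1}xg)$ are assumptions made for this example.

```python
from fractions import Fraction
from itertools import permutations
import random

def mul(p, q):
    """Compose permutations written as tuples: (p*q)(i) = p(q(i))."""
    return tuple(p[i] for i in q)

def inv(p):
    """Inverse permutation."""
    r = [0] * len(p)
    for i, v in enumerate(p):
        r[v] = i
    return tuple(r)

G = list(permutations(range(4)))   # S_4, an illustrative finite group
rng = random.Random(4)
f = {x: Fraction(rng.randint(-5, 5), 5) for x in G}

def sq_norm(func):
    """Squared L^2 norm for the uniform probability measure on G."""
    return sum(v * v for v in func.values()) / len(G)

def conj(func, g):
    """The function x -> func(g^{-1} x g) (assumed convention for L_g R_g func)."""
    return {x: func[mul(mul(inv(g), x), g)] for x in G}

def class_average(func, x):
    """Average of func over the conjugacy class of x."""
    cls = {mul(mul(h, x), inv(h)) for h in G}
    return sum(func[y] for y in cls) / len(cls)

def diff(u, v):
    """Pointwise difference u - v."""
    return {x: u[x] - v[x] for x in G}

cond_exp = {x: class_average(f, x) for x in G}

# Average over g of ||f - L_g R_g f||^2, versus 2 ||f - E(f | I_G)||^2.
lhs_average = sum(sq_norm(diff(f, conj(f, g))) for g in G) / len(G)
rhs = 2 * sq_norm(diff(f, cond_exp))
```

Exact `Fraction` arithmetic is used so that the two sides can be compared with `==` rather than up to rounding.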

We then have an analogue of Corollary 37:

Corollary 40 (Almost sure relative diagonal mixing, again). Let $G$, $(g_a)_{a \in \mathbf{N}}$, $H$, $\mathcal{X}_0$, $\mathcal{X}$
be as in Corollary 37. Then almost surely, one has the identity

$$H\text{-}\lim_g \| f - L_g R_g f \|_{L^2(G, \mathcal{B}_G, \mu_G)}^2 = 2 \| f - \mathbf{E}(f | \mathcal{I}_G) \|_{L^2(G, \mathcal{B}_G, \mu_G)}^2$$

for all $f \in L^2(G, \mathcal{X}, \mu_G)$.

We make the technical remark that if $f$ is $\mathcal{X}$-measurable, then the conditional
expectation $\mathbf{E}(f | \mathcal{I}_G)$ need not remain $\mathcal{X}$-measurable. However, both $f$ and $\mathbf{E}(f | \mathcal{I}_G)$
remain $\mathcal{B}_G$-measurable, so the expressions in the above corollary continue to make
sense.
Proof. By an approximation argument, we may assume that $f$ is a (random) simple
function, and more specifically that it is a finite linear combination with rational
coefficients of indicator functions of Boolean combinations of finitely many sets of
the form $L_g R_h E_n$, where $g, h \in \langle H \rangle$ and $E_1, E_2, \ldots$ generate $\mathcal{X}_0$. There are only
countably many such functions, so it suffices to verify the claim for a single such
function. But such a function depends (in a jointly Loeb measurable fashion) on
only finitely many of the $g_1, g_2, \ldots$, and the claim then follows from Lemma 39. □

We now combine Proposition 34, Lemma 36, and Lemma 38 to obtain the following
construction of a deterministic IP system $H$ with good properties.

Proposition 41 (Deterministic construction of a mixing IP system). Let $G$ be an
ultra quasirandom group, and let $\mathcal{X}_0$ be a (deterministic) separable sub-$\sigma$-algebra of
$\mathcal{B}_G$. Let $E$ be a Loeb measurable subset of $G$ of positive measure. Then there exists
a (deterministic) sequence $g_1, g_2, \ldots$ of elements of $G$ whose associated IP system

$$H = \{ g_{i_1} \cdots g_{i_r} : r \geq 1; \ 1 \leq i_1 < i_2 < \cdots < i_r \}$$

obeys the following properties, where $\mathcal{X}$ is the $\sigma$-algebra generated by $\mathcal{X}_0$ and $H$:

(i) (Containment in $E$) One has $H \subset E$.

(ii) (Left and right mixing) One has

$$H\text{-}\lim_g \int_G f \, L_g f' \, d\mu_G = \Big(\int_G f \, d\mu_G\Big)\Big(\int_G f' \, d\mu_G\Big)$$

and

$$H\text{-}\lim_g \int_G f \, R_g f' \, d\mu_G = \Big(\int_G f \, d\mu_G\Big)\Big(\int_G f' \, d\mu_G\Big)$$

for all $f, f' \in L^2(G, \mathcal{X}, \mu_G)$.

(iii) (Diagonal relative mixing) One has

$$H\text{-}\lim_g \| f - L_g R_g f \|_{L^2(G, \mathcal{B}_G, \mu_G)}^2 = 2 \| f - \mathbf{E}(f | \mathcal{I}_G) \|_{L^2(G, \mathcal{B}_G, \mu_G)}^2$$

for all $f \in L^2(G, \mathcal{X}, \mu_G)$.

Proof. If $E$ had full measure, the claim would be immediate from Proposition 34
(applied for $k = 1, 2, 3, \ldots$), Corollary 37, and Corollary 40, since a random choice of
$g_1, g_2, \ldots$ would then almost surely obey all the required properties. Unfortunately,
this argument does not work directly when $\mu_G(E) < 1$, because the probability
$\mu_G(E)^{2^k - 1}$ appearing in Proposition 34 then decays to zero as $k \to \infty$. Nevertheless,
one can still proceed in this case by using Proposition 34, Lemma 36, and Lemma 38
to obtain a countable sequence of finitary truncations of Proposition 41, and then
appeal to countable compactness to obtain the full strength of Proposition 41.

We turn to the details. Let $\mathcal{X}_0$ be generated by Loeb measurable sets $E_1, E_2, \ldots$.
By modifying each $E_i$ (and hence each set in $\mathcal{X}_0$) by a null set if necessary using
Lemma 17, we may assume without loss of generality that the $E_1, E_2, \ldots$ are internal
sets. For each $k$, we may apply each of Proposition 34, Lemma 36, and Lemma 38
a finite number of times to locate a finite deterministic sequence $(g_i)_{i=1}^{k} = (g_i^{(k)})_{i=1}^{k}$
of group elements obeying the following properties:

(i) (Truncated containment in $E$) One has

$$\{ g_{i_1} \cdots g_{i_r} : r \geq 1; \ 1 \leq i_1 < i_2 < \cdots < i_r \leq k \} \subset E.$$

(ii) (Truncated left and right mixing) For any $1 \leq k' < k$, any simple functions
$f, f'$ that are linear combinations of at most $k'$ indicator functions, each
of which are boolean combinations of at most $k'$ sets of the form $L_g R_h E_i$,
where $i \leq k'$, and $g, h$ are words of length at most $k'$ in $g_1^{\pm 1}, \ldots, g_{k'}^{\pm 1}$, with the
coefficients of the linear combinations being rational with numerator and
denominator bounded in magnitude by $k'$, we have

$$\int_G f \, L_g f' \, d\mu_G = \Big(\int_G f \, d\mu_G\Big)\Big(\int_G f' \, d\mu_G\Big)$$

and

$$\int_G f \, R_g f' \, d\mu_G = \Big(\int_G f \, d\mu_G\Big)\Big(\int_G f' \, d\mu_G\Big)$$

whenever $g = g_{i_1} \cdots g_{i_r}$ for some $r \geq 1$ and $k' < i_1 < \cdots < i_r \leq k$.

(iii) For any $1 \leq k' < k$, and any $f$ and $g$ as in (ii), we have

$$\| f - L_g R_g f \|_{L^2(G, \mathcal{B}_G, \mu_G)}^2 = 2 \| f - \mathbf{E}(f | \mathcal{I}_G) \|_{L^2(G, \mathcal{B}_G, \mu_G)}^2.$$

One can verify that for each $k$, the property of $(g_1, \ldots, g_k) \in G^k$ obeying the
above properties describes a countably closed subset of $G^k$. (Here we are implicitly
using the continuous nature of the standard part function.) On the other hand,
from Lemma 14 and Lemma 18 we know that the space $G^{\mathbf{N}}$ of sequences $(g_n)_{n \in \mathbf{N}}$ is
countably compact. We thus conclude the existence of an infinite sequence $g_1, g_2, \ldots$
in $G$, such that the finite truncations $g_1, \ldots, g_k$ obey the above truncated properties
for each $k$. By repeating the limiting arguments in Corollary 37 and Corollary 40,
we then obtain the desired properties (i)-(iii) for Proposition 41. □

As a consequence of this proposition and Theorem 25, we can obtain an ultraproduct
version of Theorem 5:

Theorem 42 (Relative weak mixing, ultraproduct version). Let $G$ be an ultra
quasirandom group, and let $f_1, f_2, f_3 \in L^\infty(G, \mathcal{B}_G, \mu_G)$. Then one has

$$\int_G f_1 (L_g f_2)(L_g R_g f_3) \, d\mu_G = \Big(\int_G f_2 \, d\mu_G\Big)\Big(\int_G f_1 \mathbf{E}(f_3 | \mathcal{I}_G) \, d\mu_G\Big)$$

for $\mu_G$-almost all $g \in G$.

Proof. First suppose that $f_3$ is $\mathcal{I}_G$-measurable. Then $L_g R_g f_3 = f_3$ for every $g$, and
the claim then follows from Lemma 33 (replacing $f_1$ by $f_1 f_3$). Thus we may assume
without loss of generality that $\mathbf{E}(f_3 | \mathcal{I}_G) = 0$, and the task is now to show that

$$\int_G f_1 (L_g f_2)(L_g R_g f_3) \, d\mu_G = 0$$

for $\mu_G$-almost all $g \in G$.

Suppose this is not the case. Then one can find a Loeb measurable set $E \subset G$
of positive Loeb measure, and an $\varepsilon > 0$, such that

$$\Big| \int_G f_1 (L_g f_2)(L_g R_g f_3) \, d\mu_G \Big| \geq \varepsilon \tag{23}$$

for all $g \in E$. We then apply Proposition 41 with $\mathcal{X}_0$ equal to the (separable)
$\sigma$-algebra generated by $f_1, f_2, f_3$, to find an IP system $H$ inside $E$ obeying all the
conclusions of that proposition.
Let $(X, \mathcal{X}, \mu)$ be the restriction of $(G, \mathcal{B}_G, \mu_G)$ to the $\sigma$-algebra $\mathcal{X}$ generated by
$\mathcal{X}_0$ and the shifts $L_g, R_g$ for $g \in H$. We now verify the hypotheses of Theorem 25.
From Proposition 41(ii) we have

$$H\text{-}\lim_g \int_X f \, L_g f' \, d\mu = \Big(\int_X f \, d\mu\Big)\Big(\int_X f' \, d\mu\Big)$$

and

$$H\text{-}\lim_g \int_X f \, R_g f' \, d\mu = \Big(\int_X f \, d\mu\Big)\Big(\int_X f' \, d\mu\Big)$$

for all $f, f' \in L^2(X, \mathcal{X}, \mu)$, which are the hypotheses (i) and (ii) for Theorem 25.

Now we verify hypothesis (iii) for Theorem 25. Suppose that $f \in L^2(X, \mathcal{X}, \mu)$
is any function with the rigidity property that, for any $\varepsilon > 0$ and natural number
$n$, there exists $g \in H_n$ such that $\| L_g R_g f - f \|_{L^2(X, \mathcal{X}, \mu)} \leq \varepsilon$. By Proposition 41(iii),
we conclude that

$$\| f - \mathbf{E}(f | \mathcal{I}_G) \|_{L^2(G, \mathcal{B}_G, \mu_G)} = 0,$$

thus $f$ is $\mathcal{I}_G$-measurable up to $\mu_G$-almost everywhere equivalence. Since $\mathbf{E}(f_3 | \mathcal{I}_G) = 0$,
we conclude that $\int_X f f_3 \, d\mu = 0$, giving hypothesis (iii) for Theorem 25. We may
then apply Theorem 25 to find $g \in H$ such that

$$\Big| \int_X f_1 (L_g f_2)(L_g R_g f_3) \, d\mu \Big| < \varepsilon.$$

But this contradicts (23) and Proposition 41(i), and the claim follows. □

Remark 43. One can reformulate the conclusion of Theorem 42 as the assertion
that the pushforward of the Loeb measure $\mu_{G^2}$ to $G^4$ under the map $(x, g) \mapsto
(g, x, xg, gx)$, when restricted to the product $\sigma$-algebra $\mathcal{B}_G \times \mathcal{B}_G \times \mathcal{B}_G \times \mathcal{B}_G$, is equal
to $\mu_G \times \mu_G \times (\mu_G \times_{\mathcal{I}_G} \mu_G)$, where $\mu_G \times_{\mathcal{I}_G} \mu_G$ is the relative product of the measures
$\mu_G$ with respect to the factor $\mathcal{I}_G$.

Finally, we can use the ultraproduct correspondence principle to recover Theorem 5.

Proof of Theorem 5. A simple change of variables reveals the identity

$$\mathbf{E}_{x \in G} f_1(x) f_2(xg) f_3(gx) = \int_G f_2 (L_g f_1)(L_g R_g f_3) \, d\mu_G$$

for any finite group $G$ and functions $f_1, f_2, f_3 : G \to \mathbf{R}$. Thus, it suffices to show
that for any $D$-quasirandom group $G$, and any functions $f_1, f_2, f_3 : G \to [-1, 1]$,
one has

$$\int_G \Big| \int_G f_2 (L_g f_1)(L_g R_g f_3) \, d\mu_G - \Big(\int_G f_1 \, d\mu_G\Big)\Big(\int_G f_2 \mathbf{E}(f_3 | \mathcal{I}_G) \, d\mu_G\Big) \Big| \, d\mu_G(g) \leq c(D)$$

for some $c(D)$ going to zero as $D \to \infty$.

Suppose for sake of contradiction that this claim failed. Carefully negating the
quantifiers, we may then find an $\varepsilon > 0$ and a sequence $G_n$ of finite groups and
functions $f_{1,n}, f_{2,n}, f_{3,n} : G_n \to [-1, 1]$, such that for each $n$, $G_n$ is $n$-quasirandom
and

$$\int_{G_n} \Big| \int_{G_n} f_{2,n} (L_{n,g_n} f_{1,n})(L_{n,g_n} R_{n,g_n} f_{3,n}) \, d\mu_{G_n} - \Big(\int_{G_n} f_{1,n} \, d\mu_{G_n}\Big)\Big(\int_{G_n} f_{2,n} \mathbf{E}(f_{3,n} | \mathcal{I}_{G_n}) \, d\mu_{G_n}\Big) \Big| \, d\mu_{G_n}(g_n) \geq \varepsilon. \tag{24}$$

If we now form the ultraproduct $G := \prod_{n \to \alpha} G_n$ and the functions $f_i := \operatorname{st} \lim_{n \to \alpha} f_{i,n}$,
then $G$ is an ultra quasirandom group and $f_1, f_2, f_3 \in L^\infty(G, \mathcal{B}_G, \mu_G)$. Taking
ultralimits of (24) (and using Lemma 38), we see that

$$\int_G \Big| \int_G f_2 (L_g f_1)(L_g R_g f_3) \, d\mu_G - \Big(\int_G f_1 \, d\mu_G\Big)\Big(\int_G f_2 \mathbf{E}(f_3 | \mathcal{I}_G) \, d\mu_G\Big) \Big| \, d\mu_G(g) \geq \varepsilon.$$

But this contradicts Theorem 42, and Theorem 5 follows. □

6. Proof of Theorem 10

We now give two proofs of Theorem 10: a combinatorial proof, and a proof
using the ultraproduct correspondence principle. In both proofs, we use the fact
that in a finite group $G$ with a subset $A \subset G$, the number of pairs $(x, g) \in G^2$ with
$x, xg, gx \in A$ is also equal to

$$|G|^2 \int_{G^3} 1_A(ab) \, 1_A(aca^{-1}) \, 1_A(b^{-1}cb) \, d\mu_{G^3}(a, b, c), \tag{25}$$

as can be seen by applying the $|G|$-to-one change of variables $(x, g) := (ab, b^{-1}ca^{-1})$,
for which $xg = aca^{-1}$ and $gx = b^{-1}cb$.
Note that each of the three factors $1_A(ab)$, $1_A(aca^{-1})$, $1_A(b^{-1}cb)$ depends on only
two of the three variables $a, b, c$.
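The counting identity behind (25) can be checked by brute force in a small group. The sketch below is only an illustration under stated assumptions: it uses $S_3$, reads the third factor as $1_A(b^{-1}cb)$ and the change of variables as $(x, g) = (ab, b^{-1}ca^{-1})$, which is the reading compatible with the substitution $c = ba$ used later in this section, and the function names are hypothetical. Since $\mu_{G^3}$ is normalised counting measure, (25) amounts to saying that the number of admissible triples is $|G|$ times the number of admissible pairs.

```python
from itertools import permutations, product
import random

def mul(p, q):
    """Compose permutations written as tuples: (p*q)(i) = p(q(i))."""
    return tuple(p[i] for i in q)

def inv(p):
    """Inverse permutation."""
    r = [0] * len(p)
    for i, v in enumerate(p):
        r[v] = i
    return tuple(r)

G = list(permutations(range(3)))   # S_3, small enough for exhaustive counting

def count_pairs(A):
    """Number of (x, g) with x, xg and gx all in A."""
    return sum(1 for x, g in product(G, repeat=2)
               if x in A and mul(x, g) in A and mul(g, x) in A)

def count_triples(A):
    """Number of (a, b, c) with ab, a c a^{-1} and b^{-1} c b all in A."""
    return sum(1 for a, b, c in product(G, repeat=3)
               if mul(a, b) in A
               and mul(mul(a, c), inv(a)) in A
               and mul(mul(inv(b), c), b) in A)

rng = random.Random(5)
samples = [set(rng.sample(G, size)) for size in (1, 2, 3, 4, 5)]

# (25): pairs = |G|^2 * triples / |G|^3, i.e. triples = |G| * pairs.
identity_holds = all(count_triples(A) == len(G) * count_pairs(A) for A in samples)
```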

We begin with the combinatorial proof. The main tool is the triangle removal 
lemma of Ruzsa and Szemeredi: 

Lemma 44 (Triangle removal lemma). For every $\delta > 0$ there exists $\varepsilon > 0$ such
that if $G = (V, E)$ is a graph on $n$ vertices with at most $\varepsilon n^3$ triangles, then it is
possible to remove fewer than $\delta n^2$ edges from the graph to obtain a graph with no
triangles whatsoever.

Proof. See [29]. The main ingredient of the proof is the Szemerédi regularity lemma
[31]. □

Now we prove Theorem 10. Let $\delta > 0$, and let $\varepsilon > 0$ be sufficiently small
depending on $\delta$. Suppose for contradiction that we can find a finite group $G$ and
a subset $A$ of $G$ with $|A| \geq \delta |G|$, and such that there are at most $\varepsilon |G|^2$ pairs
$(x, g) \in G \times G$ with $x, gx, xg \in A$; using the formula (25), we conclude that

$$\mathbf{E}_{a,b,c \in G} \, 1_A(ab) \, 1_A(aca^{-1}) \, 1_A(b^{-1}cb) \leq \varepsilon. \tag{26}$$

Now consider the tripartite graph $(V, E)$ with $V = G \times \{1, 2, 3\}$, and $E$ given by the
following edges:

• If $(a, 1), (b, 2) \in V$ are such that $ab \in A$, then $\{(a, 1), (b, 2)\} \in E$.

• If $(a, 1), (c, 3) \in V$ are such that $aca^{-1} \in A$, then $\{(a, 1), (c, 3)\} \in E$.

• If $(b, 2), (c, 3) \in V$ are such that $b^{-1}cb \in A$, then $\{(b, 2), (c, 3)\} \in E$.

• There are no further edges.

From (26) we see that $(V, E)$ contains $3|G|$ vertices and at most $\varepsilon |G|^3$ triangles.
Applying Lemma 44, we see (for $\varepsilon$ small enough) that one can remove all the triangles
from $(V, E)$ by deleting fewer than $\delta |G|^2$ edges. On the other hand, each pair
$(a, b) \in G^2$ with $ab \in A$ leads to a triangle in $(V, E)$ with vertices $(a, 1), (b, 2), (ba, 3)$.
There are $|A||G| \geq \delta |G|^2$ such triangles, and the edges in these triangles
are all disjoint, and so at least $\delta |G|^2$ edges need to be deleted in order to remove
all triangles. This gives the desired contradiction, and Theorem 10 follows. □
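The two facts used at the end of this argument, that each pair $(a, b)$ with $ab \in A$ yields a triangle on $(a, 1), (b, 2), (ba, 3)$ and that these triangles are edge-disjoint, can be verified directly on a small example. The sketch below does this for $S_3$; it assumes the third edge rule is $b^{-1}cb \in A$ (the reading under which $c = ba$ closes the triangle), and the set `A` and all names are chosen here for illustration.

```python
from itertools import permutations
import random

def mul(p, q):
    """Compose permutations written as tuples: (p*q)(i) = p(q(i))."""
    return tuple(p[i] for i in q)

def inv(p):
    """Inverse permutation."""
    r = [0] * len(p)
    for i, v in enumerate(p):
        r[v] = i
    return tuple(r)

G = list(permutations(range(3)))   # S_3
rng = random.Random(6)
A = set(rng.sample(G, 2))

# Edge sets of the tripartite graph on G x {1, 2, 3}.
edges12 = {(a, b) for a in G for b in G if mul(a, b) in A}
edges13 = {(a, c) for a in G for c in G if mul(mul(a, c), inv(a)) in A}
edges23 = {(b, c) for b in G for c in G if mul(mul(inv(b), c), b) in A}

# One candidate triangle (a, b, ba) for each pair with ab in A.
triangles = [(a, b, mul(b, a)) for (a, b) in sorted(edges12)]
all_are_triangles = all((a, c) in edges13 and (b, c) in edges23
                        for a, b, c in triangles)

# Edge-disjointness: no edge is used by two of these triangles.
used_edges = [e for a, b, c in triangles
              for e in (("12", a, b), ("13", a, c), ("23", b, c))]
edge_disjoint = len(set(used_edges)) == len(used_edges)
```

Any two of the three vertices of such a triangle determine the pair $(a, b)$, which is why no edge can be shared.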
Remark 45. The above argument in fact gives a quantitative value for $\varepsilon$ which is
of tower-exponential type with respect to $\delta$. It would be of interest to obtain any
improvement to this bound.

Now we give the ultraproduct proof. We can deduce Theorem 10 from its
ultraproduct version:

Theorem 46 (Strong recurrence, ultraproduct version). Let $G$ be an ultraproduct
of finite groups, and let $A$ be a Loeb measurable subset of $G$ with $\mu_G(A) > 0$. Then

$$\int_{G^3} 1_A(ab) \, 1_A(aca^{-1}) \, 1_A(b^{-1}cb) \, d\mu_{G^3}(a, b, c) > 0.$$

The derivation of Theorem 10 from Theorem 46 (using (25)) is a routine (and
simpler) variant of the derivation of Theorem 9 from Proposition 22 or Theorem 5
from Theorem 42, and is omitted.

We will need the following ultraproduct variant of the triangle removal lemma: 

Lemma 47 (Ultraproduct triangle removal lemma). Let $V_1, V_2, V_3$ be the ultraproducts
of finite non-empty sets, and let $A_{12}, A_{23}, A_{13}$ be Loeb measurable subsets of
$V_1 \times V_2$, $V_2 \times V_3$, $V_1 \times V_3$ respectively. Suppose that

$$\int_{V_1 \times V_2 \times V_3} 1_{A_{12}}(a, b) \, 1_{A_{23}}(b, c) \, 1_{A_{13}}(a, c) \, d\mu_{V_1 \times V_2 \times V_3}(a, b, c) = 0.$$

Then for any $\varepsilon > 0$, there exist Loeb measurable subsets $A'_{12}, A'_{23}, A'_{13}$ of $V_1 \times V_2$,
$V_2 \times V_3$, $V_1 \times V_3$ respectively with

$$1_{A'_{12}}(a, b) \, 1_{A'_{23}}(b, c) \, 1_{A'_{13}}(a, c) = 0 \tag{27}$$

for all $a \in V_1$, $b \in V_2$, $c \in V_3$, and

$$\mu_{V_i \times V_j}(A_{ij} \, \Delta \, A'_{ij}) \leq \varepsilon \tag{28}$$

for $ij = 12, 23, 13$.

This lemma was proven in [15] and [32]; for the sake of completeness, we give a
proof later in this section. Assuming this lemma for now, let us conclude the proof
of Theorem 46. Suppose for contradiction that

$$\int_{G^3} 1_A(ab) \, 1_A(aca^{-1}) \, 1_A(b^{-1}cb) \, d\mu_{G^3}(a, b, c) = 0. \tag{29}$$

Let $\varepsilon > 0$ be chosen later. Applying Lemma 47, we can find Loeb measurable
subsets $A'_{12}, A'_{23}, A'_{13}$ of $G^2$ such that

$$\mu_{G^2}(\{ (a, b) \in G^2 : ab \in A; \ (a, b) \notin A'_{12} \}) \leq \varepsilon \tag{30}$$

$$\mu_{G^2}(\{ (a, c) \in G^2 : aca^{-1} \in A; \ (a, c) \notin A'_{13} \}) \leq \varepsilon \tag{31}$$

$$\mu_{G^2}(\{ (b, c) \in G^2 : b^{-1}cb \in A; \ (b, c) \notin A'_{23} \}) \leq \varepsilon \tag{32}$$

and

$$1_{A'_{12}}(a, b) \, 1_{A'_{13}}(a, c) \, 1_{A'_{23}}(b, c) = 0$$

for all $a, b, c \in G$. In particular, one has

$$\int_{G^2} 1_{A'_{12}}(a, b) \, 1_{A'_{13}}(a, ba) \, 1_{A'_{23}}(b, ba) \, d\mu_{G^2}(a, b) = 0. \tag{33}$$

Using (31), (32) and the change of variables $c = ba$, we see that

$$\mu_{G^2}(\{ (a, b) \in G^2 : ab \in A; \ (a, ba) \notin A'_{13} \}) \leq \varepsilon$$

$$\mu_{G^2}(\{ (a, b) \in G^2 : ab \in A; \ (b, ba) \notin A'_{23} \}) \leq \varepsilon$$

and from these bounds, (30), and (33) we conclude that

$$\int_{G^2} 1_A(ab) \, d\mu_{G^2}(a, b) \leq 3\varepsilon$$

and thus $\mu_G(A) \leq 3\varepsilon$. Since $\mu_G(A)$ was assumed to be positive, we obtain a contradiction
for $\varepsilon$ small enough, establishing Theorem 46 and hence Theorem 10.

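Now we prove Lemma 47; this will be the standard proof of Lemma 44 converted
into ultraproduct form. Observe that for each $c \in V_3$, the function $(a, b) \mapsto
1_{A_{13}}(a, c) \, 1_{A_{23}}(b, c)$ is measurable with respect to the product $\sigma$-algebra $\mathcal{B}_{V_1} \times \mathcal{B}_{V_2}$.
Thus we have

$$\int_{V_1 \times V_2} 1_{A_{12}}(a, b) \, 1_{A_{13}}(a, c) \, 1_{A_{23}}(b, c) \, d\mu_{V_1 \times V_2}(a, b)
= \int_{V_1 \times V_2} \mathbf{E}(1_{A_{12}} | \mathcal{B}_{V_1} \times \mathcal{B}_{V_2})(a, b) \, 1_{A_{13}}(a, c) \, 1_{A_{23}}(b, c) \, d\mu_{V_1 \times V_2}(a, b).$$

Integrating in $c$ using the Fubini-Tonelli theorem (Theorem 19), we conclude that

$$\int_{V_1 \times V_2 \times V_3} \mathbf{E}(1_{A_{12}} | \mathcal{B}_{V_1} \times \mathcal{B}_{V_2})(a, b) \, 1_{A_{13}}(a, c) \, 1_{A_{23}}(b, c) \, d\mu_{V_1 \times V_2 \times V_3}(a, b, c) = 0.$$

Arguing similarly using the other two factors, we conclude that

$$\int_{V_1 \times V_2 \times V_3} f_{12}(a, b) \, f_{13}(a, c) \, f_{23}(b, c) \, d\mu_{V_1 \times V_2 \times V_3}(a, b, c) = 0$$

where $f_{ij} := \mathbf{E}(1_{A_{ij}} | \mathcal{B}_{V_i} \times \mathcal{B}_{V_j})$ for $ij = 12, 13, 23$.

For each $ij = 12, 13, 23$, let $\tilde A_{ij} \subset V_i \times V_j$ be the set $\tilde A_{ij} := \{ x \in V_i \times V_j : f_{ij}(x) \geq \varepsilon/4 \}$.
Since we have the pointwise bound

$$0 \leq 1_{\tilde A_{ij}} \leq 4 \varepsilon^{-1} f_{ij},$$

we conclude that

$$\int_{V_1 \times V_2 \times V_3} 1_{\tilde A_{12}}(a, b) \, 1_{\tilde A_{13}}(a, c) \, 1_{\tilde A_{23}}(b, c) \, d\mu_{V_1 \times V_2 \times V_3}(a, b, c) = 0. \tag{34}$$

For each $ij = 12, 13, 23$, the sets $\tilde A_{ij}$ are measurable with respect to the product
$\sigma$-algebra $\mathcal{B}_{V_i} \times \mathcal{B}_{V_j}$, which is generated by product sets $E_i \times F_j$ for $E_i \in \mathcal{B}_{V_i}$,
$F_j \in \mathcal{B}_{V_j}$. Approximating $\tilde A_{ij}$ to error $\varepsilon/4$ by a finite combination of these sets, we
can find finite sub-$\sigma$-algebras $\mathcal{B}'_{V_i, ij}, \mathcal{B}'_{V_j, ij}$ of $\mathcal{B}_{V_i}, \mathcal{B}_{V_j}$ respectively such that

$$\| 1_{\tilde A_{ij}} - \mathbf{E}(1_{\tilde A_{ij}} | \mathcal{B}'_{V_i, ij} \times \mathcal{B}'_{V_j, ij}) \|_{L^1(V_i \times V_j, \mathcal{B}_{V_i} \times \mathcal{B}_{V_j}, \mu_{V_i \times V_j})} \leq \varepsilon/4.$$

By combining the finite factors together, we thus obtain a single finite factor $\mathcal{B}'_{V_i}$
for each $i = 1, 2, 3$ with the property that

$$\| 1_{\tilde A_{ij}} - \mathbf{E}(1_{\tilde A_{ij}} | \mathcal{B}'_{V_i} \times \mathcal{B}'_{V_j}) \|_{L^1(V_i \times V_j, \mathcal{B}_{V_i} \times \mathcal{B}_{V_j}, \mu_{V_i \times V_j})} \leq \varepsilon/4 \tag{35}$$

for $ij = 12, 13, 23$. By absorbing atoms of zero measure, we can assume that all
atoms in the $\mathcal{B}'_{V_i}$ have positive measure.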
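For $ij = 12, 13, 23$, let $A'_{ij}$ be the restriction of $A_{ij} \cap \tilde A_{ij}$ to those atoms $\Sigma_i \times \Sigma_j$ of
$\mathcal{B}'_{V_i} \times \mathcal{B}'_{V_j}$ for which

$$\mathbf{E}(1_{\tilde A_{ij}} | \mathcal{B}'_{V_i} \times \mathcal{B}'_{V_j}) > 2/3. \tag{36}$$

We claim that (27) holds for any $a \in V_1$, $b \in V_2$, $c \in V_3$. Suppose not, and let $\Sigma_1, \Sigma_2, \Sigma_3$
be the atoms of $\mathcal{B}'_{V_1}, \mathcal{B}'_{V_2}, \mathcal{B}'_{V_3}$ containing $a, b, c$ respectively. From (36) and the
Fubini-Tonelli theorem, we see that the sets

$$\{ (a, b, c) \in \Sigma_1 \times \Sigma_2 \times \Sigma_3 : (a, b) \in \tilde A_{12} \}$$

$$\{ (a, b, c) \in \Sigma_1 \times \Sigma_2 \times \Sigma_3 : (a, c) \in \tilde A_{13} \}$$

$$\{ (a, b, c) \in \Sigma_1 \times \Sigma_2 \times \Sigma_3 : (b, c) \in \tilde A_{23} \}$$

each have density greater than $2/3$ in $\Sigma_1 \times \Sigma_2 \times \Sigma_3$, and hence the set

$$\{ (a, b, c) \in \Sigma_1 \times \Sigma_2 \times \Sigma_3 : (a, b) \in \tilde A_{12}; \ (a, c) \in \tilde A_{13}; \ (b, c) \in \tilde A_{23} \}$$

has positive measure, contradicting (34). This establishes (27).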
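The step from "each of the three sets has density greater than $2/3$" to "their intersection has positive measure" is a union bound on the complements. Written out, with $S_{12}, S_{13}, S_{23}$ denoting the three sets above and $\nu$ the measure on the atom $\Sigma_1 \times \Sigma_2 \times \Sigma_3$ normalised to total mass one (both names are introduced here only for illustration):

```latex
\nu(S_{12} \cap S_{13} \cap S_{23})
  \;\geq\; 1 - \nu(S_{12}^{c}) - \nu(S_{13}^{c}) - \nu(S_{23}^{c})
  \;>\; 1 - 3 \cdot \tfrac{1}{3}
  \;=\; 0 .
```

The normalisation is legitimate because all atoms were arranged to have positive measure, and the threshold $2/3$ is exactly what makes the three complements sum to less than one.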

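Finally, we need to show (28). For sake of notation, we show this for $ij = 12$, as
the other two cases are analogous. By construction, the set $A_{12} \, \Delta \, A'_{12}$ is contained
in the set

$$\{ (a, b) \in A_{12} : f_{12} < \varepsilon/4 \} \cup \{ (a, b) \in \tilde A_{12} : \mathbf{E}(1_{\tilde A_{12}} | \mathcal{B}'_{V_1} \times \mathcal{B}'_{V_2}) \leq 2/3 \}$$

and so

$$\mu_{V_1 \times V_2}(A_{12} \, \Delta \, A'_{12}) \leq \int_{V_1 \times V_2} 1_{A_{12}} 1_{f_{12} < \varepsilon/4} + 1_{\tilde A_{12}} 1_{\mathbf{E}(1_{\tilde A_{12}} | \mathcal{B}'_{V_1} \times \mathcal{B}'_{V_2}) \leq 2/3} \, d\mu_{V_1 \times V_2}.$$

The right-hand side can be written as the sum of

$$\int_{V_1 \times V_2} f_{12} \, 1_{f_{12} < \varepsilon/4} \, d\mu_{V_1 \times V_2}$$

and

$$\int_{V_1 \times V_2} 1_{\tilde A_{12}} \, 1_{1_{\tilde A_{12}} - \mathbf{E}(1_{\tilde A_{12}} | \mathcal{B}'_{V_1} \times \mathcal{B}'_{V_2}) \geq 1/3} \, d\mu_{V_1 \times V_2}.$$

The first integral is at most $\varepsilon/4$, while the second expression is at most

$$3 \, \| 1_{\tilde A_{12}} - \mathbf{E}(1_{\tilde A_{12}} | \mathcal{B}'_{V_1} \times \mathcal{B}'_{V_2}) \|_{L^1(V_1 \times V_2, \mathcal{B}_{V_1} \times \mathcal{B}_{V_2}, \mu_{V_1 \times V_2})}$$

which by (35) is at most $3\varepsilon/4$. The claim (28) follows.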

Now we can prove Theorem 11. We first observe that this theorem follows from
an apparently weaker version in which the final condition $(g x_1 g^{-1}, \ldots, g x_k g^{-1}) \in A$
is deleted. Namely, we will deduce Theorem 11 from the following result:

Theorem 48 (Multiple strong recurrence). Let $k \geq 1$ be a natural number. For
every $\delta > 0$, there exists $\varepsilon > 0$ such that the following statement holds: if $G$ is a
finite group, and $A$ is a subset of $G^k$ with $|A| \geq \delta |G|^k$, then there exist at least
$\varepsilon |G|^{k+1}$ tuples $(g, x_1, \ldots, x_k) \in G^{k+1}$ such that $(g x_1, \ldots, g x_i, x_{i+1}, \ldots, x_k) \in A$
for all $i = 0, \ldots, k$.

12 Recall that by the convention of ignoring the initial block $g x_1, \ldots, g x_i$ when $i = 0$ and the
final block $x_{i+1}, \ldots, x_k$ when $i = k$, we interpret $(g x_1, \ldots, g x_i, x_{i+1}, \ldots, x_k)$ as $(x_1, \ldots, x_k)$ when
$i = 0$ and $(g x_1, \ldots, g x_k)$ when $i = k$.
Indeed, if $k, \delta, \varepsilon, A$ are as in Theorem 11, one can conclude that theorem by
applying Theorem 48 with $k$ replaced by $k + 1$ and $A$ replaced by the set

$$\{ (x_1, \ldots, x_{k+1}) \in G^{k+1} : (x_1 x_{k+1}^{-1}, \ldots, x_k x_{k+1}^{-1}) \in A \};$$

we leave the routine verification of this implication to the reader.

We also remark that a variant of Theorem 48 can be proven by the arguments
used to establish [11, Corollary 6.4], as noted in the comments after that corollary.
In this variant, the conclusion is instead that there exist at least $\varepsilon |G|^{k+1}$ tuples
$(g, x_1, \ldots, x_k) \in G^{k+1}$ such that $(x_1, \ldots, x_k) \in A$ and $(x_1, \ldots, x_{i-1}, g x_i, x_{i+1}, \ldots, x_k) \in A$
for all $i = 1, \ldots, k$.

We now prove Theorem 48. We will generalize the combinatorial proof of Theorem 10
by replacing the triangle removal lemma of Ruzsa and Szemerédi with the
more general hypergraph removal lemma first established in [28], [27], [20]. (The
measure-theoretic proof also generalizes, but we leave this as an exercise to the
interested reader.) We will use the following special case of this lemma:

Lemma 49 (Simplex removal lemma). Let $k \geq 1$ be an integer. For every $\delta > 0$
there exists $\varepsilon > 0$ such that if $V_0, \ldots, V_k$ are sets of $n$ vertices, and for each $i =
0, \ldots, k$, $E_i \subset V_0 \times \cdots \times V_{i-1} \times V_{i+1} \times \cdots \times V_k$ is a set with the property that

$$\sum_{x_0 \in V_0, \ldots, x_k \in V_k} \prod_{i=0}^{k} 1_{E_i}(x_0, \ldots, x_{i-1}, x_{i+1}, \ldots, x_k) \leq \varepsilon n^{k+1}, \tag{37}$$

then it is possible to remove fewer than $\delta n^k$ elements from $E_i$ for each $i = 0, \ldots, k$
to form a subset $E'_i$ such that

$$\sum_{x_0 \in V_0, \ldots, x_k \in V_k} \prod_{i=0}^{k} 1_{E'_i}(x_0, \ldots, x_{i-1}, x_{i+1}, \ldots, x_k) = 0.$$

Proof. This is a special case of [33, Theorem 1.13]. □

Let $k, \delta$ be as in Theorem 48, let $\varepsilon > 0$ be a sufficiently small quantity, and let
$G, A$ obey the hypotheses of Theorem 48. Suppose for contradiction that there are
fewer than $\varepsilon |G|^{k+1}$ tuples $(g, x_1, \ldots, x_k) \in G^{k+1}$ such that $(g x_1, \ldots, g x_i, x_{i+1}, \ldots, x_k) \in A$
for all $i = 0, \ldots, k$.

For each $i = 0, \ldots, k$, we set $V_i := G$, and then let $E_i \subset G^k$ be the set of all
tuples $(x_0, \ldots, x_{i-1}, x_{i+1}, \ldots, x_k) \in G^k$ with the property that the $k$-tuple

$$(x_0, \ x_0 x_1, \ \ldots, \ x_0 \cdots x_{i-1}, \ x_k^{-1} \cdots x_{i+1}^{-1}, \ \ldots, \ x_k^{-1} x_{k-1}^{-1}, \ x_k^{-1})$$

lies in $A$. For instance, if $k = 3$, we have

$$E_0 = \{ (x_1, x_2, x_3) \in G^3 : (x_3^{-1} x_2^{-1} x_1^{-1}, \ x_3^{-1} x_2^{-1}, \ x_3^{-1}) \in A \}$$

$$E_1 = \{ (x_0, x_2, x_3) \in G^3 : (x_0, \ x_3^{-1} x_2^{-1}, \ x_3^{-1}) \in A \}$$

$$E_2 = \{ (x_0, x_1, x_3) \in G^3 : (x_0, \ x_0 x_1, \ x_3^{-1}) \in A \}$$

$$E_3 = \{ (x_0, x_1, x_2) \in G^3 : (x_0, \ x_0 x_1, \ x_0 x_1 x_2) \in A \}.$$

13 Continuing the previous block-ignoring convention, we interpret $(x_0, \ldots, x_{i-1}, x_{i+1}, \ldots, x_k)$
as $(x_1, \ldots, x_k)$ when $i = 0$ and $(x_0, \ldots, x_{k-1})$ when $i = k$.

14 Continuing previous conventions, we ignore the block $x_0, x_0 x_1, \ldots, x_0 \cdots x_{i-1}$ when $i = 0$,
and $x_k^{-1} \cdots x_{i+1}^{-1}, \ldots, x_k^{-1} x_{k-1}^{-1}, x_k^{-1}$ when $i = k$.
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Now suppose that (xq, . . . ,Xk) makes a non-zero contribution to the left-hand side 
of ([37]), thus 

(xq , XqXi , . . . , Xq . . . Xi — \ , Xfc ... 3 • • • 5 Xfc X X ^ ) G ^4 

for alH = 0, . . . , k. If we then define 

Vi '■= x k ... x i 

for i = 1, . . . , k, and 

g := x . . .Xfe, 

we conclude that 

(gyi,---,gyi,yi+i,---,yk) e -4 

for z = 0, . . . , k. From our hypotheses, we conclude that (|57|) holds. Applying 
Lemma l49l (with 5 replaced by S/(k + 1)), we conclude (for e small enough) that 
we can remove fewer than |rpj-|G| elements from Ei to create a subset E' i: with 
the property that there do not exist any tuples (xq, . . . , Xk+i) G G k+1 with the 
property that (xq, . . . , Xi-i, Xi+i, . . . , Xk) <E E[ for all < i < k. 

Let $(y_1, \ldots, y_k)$ be an element of $A$. Applying the previous claim with

$$(x_0, \ldots, x_k) := (y_0^{-1} y_1, y_1^{-1} y_2, \ldots, y_k^{-1} y_{k+1})$$

with the convention that $y_0 = y_{k+1} = 1$, we see that there is at least one
$0 \leq i \leq k$ such that

$$(y_0^{-1} y_1, \ldots, y_{i-1}^{-1} y_i, y_{i+1}^{-1} y_{i+2}, \ldots, y_k^{-1} y_{k+1}) \notin E'_i$$

(using the same block-ignoring conventions as before). On the other hand, from
the definition of $E_i$ and the hypothesis $(y_1, \ldots, y_k) \in A$, we see that

$$(y_0^{-1} y_1, \ldots, y_{i-1}^{-1} y_i, y_{i+1}^{-1} y_{i+2}, \ldots, y_k^{-1} y_{k+1}) \in E_i.$$

Applying the pigeonhole principle, we conclude that there exists $0 \leq i \leq k$
such that

$$(y_0^{-1} y_1, \ldots, y_{i-1}^{-1} y_i, y_{i+1}^{-1} y_{i+2}, \ldots, y_k^{-1} y_{k+1}) \in E_i \setminus E'_i$$

for at least $|A|/(k+1) \geq \frac{\delta}{k+1} |G|^k$ tuples $(y_1, \ldots, y_k)$,
thus $|E_i \setminus E'_i| \geq \frac{\delta}{k+1} |G|^k$. But this contradicts the
construction of $E'_i$, and Theorem 48 follows. □
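The two changes of variables used in this proof can be sanity-checked numerically. The following is a minimal sketch, not part of the original argument: it models $G$ by the symmetric group $S_5$ (an arbitrary choice), and the helper names `compose`, `inverse`, `product`, `defining_tuple` and `check_substitutions` are hypothetical. It checks that $g y_j = x_0 \cdots x_{j-1}$ when $y_j := x_k^{-1} \cdots x_j^{-1}$ and $g := x_0 \cdots x_k$, and that the substitution $x_j := y_j^{-1} y_{j+1}$ (with $y_0 = y_{k+1} = 1$) turns the $k$-tuple defining $E_i$ back into $(y_1, \ldots, y_k)$ for every $i$.

```python
import random

def compose(p, q):
    """Product pq of two permutations given as tuples (apply q, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, v in enumerate(p):
        inv[v] = i
    return tuple(inv)

def product(perms, n):
    """Ordered product perms[0] * perms[1] * ... (identity if the list is empty)."""
    out = tuple(range(n))
    for p in perms:
        out = compose(out, p)
    return out

def defining_tuple(x, i, n):
    """(x_0, ..., x_0...x_{i-1}, x_k^{-1}...x_{i+1}^{-1}, ..., x_k^{-1}); x_i is unused."""
    k = len(x) - 1
    left = [product(x[:j + 1], n) for j in range(i)]
    right = [inverse(product(x[j:], n)) for j in range(i + 1, k + 1)]
    return left + right

def check_substitutions(n=5, k=3, trials=200, seed=0):
    rng = random.Random(seed)
    rand = lambda: tuple(rng.sample(range(n), n))
    e = tuple(range(n))
    for _ in range(trials):
        # First substitution: y_j := x_k^{-1}...x_j^{-1}, g := x_0...x_k.
        x = [rand() for _ in range(k + 1)]
        g = product(x, n)
        y = {j: inverse(product(x[j:], n)) for j in range(1, k + 1)}
        for i in range(k + 1):
            expected = [compose(g, y[j]) for j in range(1, i + 1)]
            expected += [y[j] for j in range(i + 1, k + 1)]
            if defining_tuple(x, i, n) != expected:
                return False
        # Second substitution: x_j := y_j^{-1} y_{j+1}, with y_0 = y_{k+1} = 1.
        ys = [e] + [rand() for _ in range(k)] + [e]
        xs = [compose(inverse(ys[j]), ys[j + 1]) for j in range(k + 1)]
        for i in range(k + 1):
            if defining_tuple(xs, i, n) != ys[1:k + 1]:
                return False
    return True
```

Calling `check_substitutions()` returns `True` when both identities hold on all sampled tuples; since the identities are purely algebraic, any other finite group model would serve equally well.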

7. Remarks on specific ultra quasirandom groups 

In this section, $\alpha \in \beta\mathbb{N} \setminus \mathbb{N}$ is a fixed non-principal ultrafilter.

In Section 5, some general mixing properties were obtained for arbitrary ultra
quasirandom groups. It turns out that for some specific examples of ultra
quasirandom groups, one can obtain further mixing properties, particularly for
ultraproducts of the finite groups $SL_2(F_p)$, the mixing properties of which
have been intensively studied. Indeed, thanks to the existing literature on such
groups, we have the following results:

Theorem 50 (Mixing properties of $SL_2(F_p)$). Let $p_n$ be a sequence of primes
going to infinity, let $F$ be the characteristic zero pseudo-finite field
$F := \prod_{n \to \alpha} F_{p_n}$, and let $G$ be the ultra quasirandom group
$G := SL_2(F) = \prod_{n \to \alpha} SL_2(F_{p_n})$.



In model theory, a pseudo-finite field is a field which obeys all first-order sentences in the 
language of fields that are true in all finite fields (for instance, a pseudo-finite field has exactly 
one field extension of each finite degree). In particular, any ultraproduct of finite fields is a 
pseudo-finite field. 
(i) (Weak mixing) $G$ has no non-trivial finite-dimensional unitary
representations; thus, for any $d \geq 1$, the only homomorphism from $G$ to
$U_d(\mathbb{C})$ is the trivial one.

(ii) (Almost sure expansion) There is an absolute constant $\varepsilon > 0$ with
the property that for $\mu_{G^2}$-almost every pair $(a, b) \in G^2$, one has the
spectral gap property

$$\left\| \tfrac{1}{4} (L_a + L_b + L_{a^{-1}} + L_{b^{-1}}) \right\|_{\mathrm{op}} \leq 1 - \varepsilon, \qquad (38)$$

where $\| \cdot \|_{\mathrm{op}}$ denotes the operator norm on the space
$L^2(G, \mathcal{B}_G, \mu_G)_0$ of mean zero functions in
$L^2(G, \mathcal{B}_G, \mu_G)$.

(iii) (Uniform expansion in most cases) There exists a universal subset $A$ of
the primes of density zero, such that if the primes $p_n$ all avoid this set,
then the spectral gap property (38) holds for all pairs $(a, b) \in G^2$ which
generate a Zariski-dense subgroup of $G$.

(iv) (Uniform expansion in a $SL_2(\mathbb{Z})$ component) Identifying
$\mathbb{Z}$ with the subring generated by the identity $1$ of $F$, the spectral
gap property (38) holds for any $a, b \in SL_2(\mathbb{Z})$ generating a
Zariski-dense subgroup of $SL_2(\mathbb{Z})$.

(v) (Lack of mild mixing) For any $a \in G$, there exists a Loeb-measurable
subset $E$ of $G$ such that $\mu_G(E) = 1/2$ and $L_a E = E$. (In particular,
$\mu_G(L_a^n E \cap E) \not\to \mu_G(E)^2$ as $n \to \infty$.)

Of course, similar results hold if the left shift $L_g$ is replaced with the
right shift $R_g$ throughout. Among other things, the uniform expansion
properties of the ultra quasirandom group $G$ established in the above theorem
suggest that this group behaves very "non-amenably". In general, it appears that
ergodic theory tools that are restricted to amenable group actions are not
suitable for the analysis of ultra quasirandom groups.

A measure-preserving system is said to be mild mixing if there are no
non-trivial rigid functions; in the case of actions of abelian groups, this
concept is intermediate in strength between weak mixing and strong mixing.

Proof. We begin with (i). This does not seem to follow directly from Lemma 32,
but can be deduced by modifying the proof of that lemma, as follows. Suppose
for contradiction that we have a non-trivial representation
$\rho : SL_2(F) \to U_d(\mathbb{C})$ on a unitary group of some finite dimension
$d$. Set $a$ to be the group element

$$a := \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$$

and suppose first that $\rho(a)$ is non-trivial. Arguing as in the proof of
Lemma 32, we see that the eigenvalues of $\rho(a)$ are permuted by the operation
$x \mapsto x^m$ for any perfect square $m \in \mathbb{N}$, because $a$ is
conjugate to $a^m$. In particular, this implies that all the eigenvalues are
roots of unity; clearing denominators, we see that $\rho(a^m) = 1$ for some
perfect square $m \in \mathbb{N}$, and hence $\rho(a) = 1$, contradicting the
assumption that $\rho(a)$ is non-trivial. Thus $\rho(a) = 1$. Conjugating again,
this time by the diagonal matrix with entries $m, m^{-1}$ for a non-zero
$m \in F$, we see that

$$\rho \begin{pmatrix} 1 & m^2 \\ 0 & 1 \end{pmatrix} = 1$$

for all $m \in F$. In each finite field $F_{p_n}$, it is a classical fact that
every residue class is the sum of three quadratic residues; taking
ultraproducts, the same claim is true in $F$. As $\rho$ is a homomorphism, we
thus see that

$$\rho \begin{pmatrix} 1 & t \\ 0 & 1 \end{pmatrix} = 1$$

for any $t \in F$. By conjugation, we thus also have

$$\rho \begin{pmatrix} 1 & 0 \\ t & 1 \end{pmatrix} = 1.$$

These two one-parameter groups of matrices are easily verified to generate
$SL_2(F)$, and so $\rho$ is trivial, giving the desired contradiction.
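The two elementary facts used in the proof of (i) can be checked for small primes. The sketch below is illustrative only and is not taken from the paper; the function names are hypothetical, and it assumes that "quadratic residue" is meant to include $0$ (that is, all values $m^2$ with $m \in F_p$), matching the range "for all $m \in F$" above. It verifies that every residue class modulo $p$ is a sum of three such squares, and that conjugating $a$ by the diagonal matrix with entries $m, m^{-1}$ yields the upper unipotent matrix with off-diagonal entry $m^2$.

```python
from itertools import product

def squares_mod(p):
    """All values m*m mod p, with m = 0 allowed (an assumption, see lead-in)."""
    return {(m * m) % p for m in range(p)}

def every_class_is_sum_of_three_squares(p):
    sq = squares_mod(p)
    sums = {(a + b + c) % p for a, b, c in product(sq, repeat=3)}
    return sums == set(range(p))

def mat_mul(A, B, p):
    """Product of two 2x2 matrices with entries reduced mod p."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) % p for j in range(2))
        for i in range(2)
    )

def conjugation_gives_square(p):
    """Check diag(m, 1/m) * [[1,1],[0,1]] * diag(1/m, m) == [[1, m^2],[0,1]] mod p."""
    a = ((1, 1), (0, 1))
    for m in range(1, p):
        m_inv = pow(m, p - 2, p)  # inverse of m mod p by Fermat's little theorem
        d = ((m, 0), (0, m_inv))
        d_inv = ((m_inv, 0), (0, m))
        if mat_mul(mat_mul(d, a, p), d_inv, p) != ((1, (m * m) % p), (0, 1)):
            return False
    return True

SMALL_PRIMES = [3, 5, 7, 11, 13, 17, 19, 23]
```

For instance, `all(every_class_is_sum_of_three_squares(p) for p in SMALL_PRIMES)` tests the three-squares claim on the listed primes. A finite check of this kind is of course no substitute for the classical proof.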

Now we establish (ii). We will use (as a black box) one of the main theorems
[13, Theorem 2] of Bourgain and Gamburd, which in our notation asserts that
for each prime $p_n$ there exists an exceptional set $E_n$ of
$SL_2(F_{p_n}) \times SL_2(F_{p_n})$ of density
$\mu_{SL_2(F_{p_n}) \times SL_2(F_{p_n})}(E_n)$ going to zero as $n \to \infty$,
and an absolute constant $\varepsilon > 0$, such that

$$\left\| \tfrac{1}{4} (L_{a_n} + L_{b_n} + L_{a_n^{-1}} + L_{b_n^{-1}}) \right\|_{\mathrm{op}} \leq 1 - \varepsilon \qquad (39)$$

for all $(a_n, b_n) \in SL_2(F_{p_n}) \times SL_2(F_{p_n}) \setminus E_n$. If we let
$E := \prod_{n \to \alpha} E_n$, then $E$ is a null subset of $G \times G$, and for
any $(a, b) \in G \times G \setminus E$, we see upon taking ultralimits that

$$\left\| \tfrac{1}{4} (L_a + L_b + L_{a^{-1}} + L_{b^{-1}}) f \right\|_{L^2(G, \mathcal{B}_G, \mu_G)} \leq (1 - \varepsilon) \|f\|_{L^2(G, \mathcal{B}_G, \mu_G)}$$

whenever $f$ is the standard part of a bounded internal function of mean zero,
giving (38) for $\mu_{G \times G}$-almost all $(a, b)$, as required.

In a similar vein, to prove (iii) we use the main result of Breuillard and
Gamburd [14], which shows that there exists a subset $A$ of the primes of zero
relative density, such that if $p_n$ avoids $A$, then the spectral gap (39)
holds whenever $a_n$ and $b_n$ generate $SL_2(F_{p_n})$. The classification of
all proper subgroups of $SL_2(F_{p_n})$ is classical, and it is known that all
such subgroups either have size $O(1)$ or else are contained in a group
containing a conjugate of the Borel subgroup

$$B(F_{p_n}) := \left\{ \begin{pmatrix} s & t \\ 0 & s^{-1} \end{pmatrix} : s \in F_{p_n}^\times, t \in F_{p_n} \right\}$$

with index $O(1)$. Taking ultraproducts, we see that if $(a, b) \in G \times G$ is
such that (38) fails, then $a, b$ either lie in a finite subgroup of $G$, or a
group containing a conjugate of the Borel subgroup $B(F)$ with finite index. In
either case, $a, b$ lie in a proper algebraic subgroup of $SL_2$, and the claim
(iii) follows.

The claim (iv) follows very similarly from [15, Theorem 1] and is left to the
reader, so we turn to (v). Let $a = \lim_{n \to \alpha} a_n$ be an element of $G$,
so that $a_n \in SL_2(F_{p_n})$ for an $\alpha$-large set of $n$. For each $n$, we
consider the cyclic subgroup $\langle a_n \rangle$ of $SL_2(F_{p_n})$ generated by
$a_n$. This is an abelian subgroup of $SL_2(F_{p_n})$, and as such can easily be
verified to have cardinality $O(p_n)$. In particular, the index of
$\langle a_n \rangle$ in $SL_2(F_{p_n})$ (which has order comparable to $p_n^3$)
goes to infinity as $n \to \infty$. As such, one can form (for an $\alpha$-large
set of $n$) a subset $E_n$ of $SL_2(F_{p_n})$ which is the union of right cosets
of $\langle a_n \rangle$, and whose density $\mu_{SL_2(F_{p_n})}(E_n)$ converges to
$1/2$ as $n \to \infty$. Setting $E := \prod_{n \to \alpha} E_n$, we obtain the
claim. □

We do not know if all ultra quasirandom groups obey the conclusion (i) of the
above theorem. However, all ultra quasirandom groups obey (v), because one can
show that the index of any proper subgroup $H$ in a $D$-quasirandom group $G$ is
at least $D$ (otherwise the quasiregular representation on $L^2(G/H)$ would be
too small of a dimension), and one can run the argument used to prove (v) above
to handle the general case. The uniform expansion conjecture asserts that the
set $A$ in (iii) can be deleted, thus unifying (ii)-(iv), but this conjecture
remains open.
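The coset construction behind (v) can be carried out explicitly in a small case. The following is a minimal sketch under stated assumptions, not the authors' construction verbatim: matrices in $SL_2(F_p)$ are stored as 4-tuples `(a, b, c, d)`, the helper names are hypothetical, and cosets are chosen greedily until adding another would push the density above $1/2$. The resulting set is a union of right cosets $\langle a \rangle g$, so it is invariant under the left shift by $a$, and its density differs from $1/2$ by at most $|\langle a \rangle| / (2 |SL_2(F_p)|)$.

```python
from itertools import product

def sl2_elements(p):
    """All 2x2 matrices (a, b, c, d) over F_p with ad - bc = 1."""
    return [m for m in product(range(p), repeat=4)
            if (m[0] * m[3] - m[1] * m[2]) % p == 1]

def mul(x, y, p):
    """Matrix product xy mod p, with matrices stored row-major as 4-tuples."""
    a, b, c, d = x
    e, f, g, h = y
    return ((a * e + b * g) % p, (a * f + b * h) % p,
            (c * e + d * g) % p, (c * f + d * h) % p)

def cyclic_subgroup(a, p):
    """List the elements of the cyclic group generated by a."""
    identity = (1, 0, 0, 1)
    elems, cur = [identity], a
    while cur != identity:
        elems.append(cur)
        cur = mul(cur, a, p)
    return elems

def half_density_invariant_set(a, p):
    """Greedy union of right cosets <a>g with density at most 1/2."""
    group = sl2_elements(p)
    H = cyclic_subgroup(a, p)
    E, covered = set(), set()
    for g in group:
        if g in covered:
            continue
        coset = {mul(h, g, p) for h in H}
        covered |= coset
        # Add the coset only if the density stays at or below 1/2.
        if 2 * (len(E) + len(coset)) <= len(group):
            E |= coset
    return E, group, H
```

For example, with $p = 5$ and $a$ the upper unipotent matrix `(1, 1, 0, 1)`, the group has $120$ elements, $\langle a \rangle$ has $5$, and the set returned has exactly $60$ elements and is fixed by left multiplication by $a$.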

Finally, we observe that the failure of mild mixing that occurs in Theorem
50(v) also occurs for other limits of finite groups than ultra quasirandom
groups. We illustrate this with the infinite alternating group $A_\infty$, which
is the direct limit of the finite alternating groups $A_n$. (This result is not
used elsewhere in the paper.)

Proposition 51 (Failure of mild mixing). Let $S_\infty$ be the group of all
bijections of $\mathbb{N}$ that fix all but finitely many natural numbers; this
group can be viewed as the union of the finite permutation groups $S_n$, which
are the subgroups that fix all natural numbers outside $\{1, \ldots, n\}$. Let
$A_\infty$ be the index two subgroup of $S_\infty$ consisting of the union of the
alternating groups $A_n$. Then there exists an ergodic action of $A_\infty$ on
some probability space $(X, \mathcal{X}, \mu)$, a sequence $g_1, g_2, \ldots$ of
distinct elements of $A_\infty$, and a subset $E$ of $X$ of measure
$\mu(E) = 1/2$ such that $g_n E = E$ for all $n$.

Proof. (Sketch) Let $G := \prod_{n \to \alpha} S_n$. Thanks to the nesting of the
$S_n$, $G$ naturally contains an embedded copy of $\bigcup_n S_n = S_\infty$ and
hence $A_\infty$. Thus $A_\infty$ acts on the probability space
$(G, \mathcal{B}_G, \mu_G)$ by left shift. This system itself is not ergodic; for
instance, if $E_n$ denotes the set of permutations in $S_n$ that map an odd
number to $n$, then one can verify that the ultraproduct
$E := \prod_{n \to \alpha} E_n$ has Loeb measure $1/2$ but is invariant up to null
sets by the action of $S_\infty$. However, we can create ergodic factors of this
action as follows. Let $B_n$ denote the set of permutations in $S_n$ that map an
odd number to $1$, and let $B := \prod_{n \to \alpha} B_n$ be the ultraproduct. One
easily verifies that $\mu_G(B) = 1/2$. Let $(X, \mathcal{X}, \mu)$ be the factor
of $(G, \mathcal{B}_G, \mu_G)$ generated by $B$ and the $A_\infty$ action, thus
$X = G$, $\mu$ is the restriction of $\mu_G$ to $\mathcal{X}$, and any set in
$\mathcal{X}$ can be approximated to arbitrary accuracy in $\mu_G$ by a finite
boolean combination of shifts $L_g B$ of $B$ with $g \in S_\infty$. Any such
boolean combination is a set $F$ with the property that the membership of a
given permutation $\sigma = \lim_{n \to \alpha} \sigma_n \in G$ in $F$ depends only
on the parity
$\sigma^{-1}(i) \bmod 2 = \lim_{n \to \alpha} \sigma_n^{-1}(i) \bmod 2$ of the
preimage of a finite number of natural numbers $i$. Because of this, we will
have the Bernoulli-type mixing property
$\mu_G(L_g F \cap F') = \mu_G(F) \mu_G(F')$ for any such boolean combinations
$F, F'$, provided $g \in A_\infty$ maps a certain finite set of natural numbers to
sufficiently large values. Indeed, a simple counting argument shows that if $M$
is a fixed natural number and $\sigma_n$ is chosen at random from $A_n$, then the
parities of the $M$ quantities
$\sigma_n^{-1}(\{1\}), \ldots, \sigma_n^{-1}(\{M\})$ behave like independent
Bernoulli variables in the limit $n \to \infty$, giving the claim. This
demonstrates ergodicity of the $A_\infty$ action on $(X, \mathcal{X}, \mu)$. On
the other hand, if $g$ is any permutation that fixes $1$, then $B$ is fixed by
$L_g$ (so that one may take $E := B$), and so mild mixing fails. □
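The finite sets $B_n$ in this sketch are small enough to enumerate. The following illustration is not from the paper; it assumes permutations of $\{1, \ldots, n\}$ stored as tuples with `sigma[i-1]` the image of `i`, and the names `preimage`, `B` and `left_shift` are hypothetical. It shows that $B_n$ has density $\lceil n/2 \rceil / n$ in $S_n$, that $L_g B_n = B_n$ whenever $g$ fixes $1$, and that a shift by an element moving $1$ produces an intersection governed by the parities of two preimages.

```python
from itertools import permutations

def preimage(sigma, value):
    """Return the i in {1..n} with sigma(i) = value, where sigma[i-1] = sigma(i)."""
    return sigma.index(value) + 1

def B(n):
    """Permutations of {1..n} that map an odd number to 1."""
    return {s for s in permutations(range(1, n + 1)) if preimage(s, 1) % 2 == 1}

def left_shift(g, S):
    """L_g S = {g o sigma : sigma in S}, where (g o sigma)(i) = g(sigma(i))."""
    return {tuple(g[s_i - 1] for s_i in sigma) for sigma in S}

# Two even permutations of {1..6}: the 3-cycle (2 3 4), which fixes 1,
# and the 3-cycle (1 2 3), which moves 1.
G_FIXING_ONE = (1, 3, 4, 2, 5, 6)
G_MOVING_ONE = (2, 3, 1, 4, 5, 6)
```

With $n = 6$ the set `B(6)` has $360$ of the $720$ permutations, `left_shift(G_FIXING_ONE, B(6))` equals `B(6)`, and `left_shift(G_MOVING_ONE, B(6)) & B(6)` has $144$ elements, a density of $1/5$. In general this last density is $c(c-1)/(n(n-1))$ with $c = \lceil n/2 \rceil$, which tends to $1/4 = \mu_G(B)^2$ as $n \to \infty$, in line with the Bernoulli-type independence described in the proof.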